[GitHub] incubator-trafodion pull request: [TRAFODION-1506] TRAFCI on windo...

2015-10-05 Thread hegdean
GitHub user hegdean opened a pull request:

https://github.com/apache/incubator-trafodion/pull/103

[TRAFODION-1506] TRAFCI on windows is missing the banner and copyright information

Fixed the version information for source package
Fixed other minor issues

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hegdean/incubator-trafodion wrk-brnch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #103


commit 4e2784214bad33882719e766f8b3e5798f3301a1
Author: Anuradha Hegde 
Date:   2015-10-05T17:32:40Z

Changed the source package name to contain the version info
Fixed trafci banner issue for non-unix machines
Fixed other minor issues

commit b209f13b2eba327fdec3fc974b85afe4506f1438
Author: Anuradha Hegde 
Date:   2015-10-05T19:05:42Z

compressed the src package




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-trafodion pull request: [TRAFODION-1506] TRAFCI on windo...

2015-10-05 Thread hegdean
GitHub user hegdean opened a pull request:

https://github.com/apache/incubator-trafodion/pull/106

[TRAFODION-1506] TRAFCI on windows is missing the banner and copyright information

Added copyright to template file
Changed the source packaging tar file name



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hegdean/incubator-trafodion wrk-brnch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #106


commit 7c7245e821b09dd2c1ae45c3a02912123fadc13a
Author: Anuradha Hegde 
Date:   2015-10-06T04:51:56Z

Added copyright to template file
Fixed the package name for the src tar file






[GitHub] incubator-trafodion pull request: [TRAFODION-25] First draft of co...

2015-10-05 Thread zellerh
Github user zellerh commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/102#discussion_r41206233
  
--- Diff: core/sql/optimizer/ScmCostMethod.cpp ---
@@ -3632,6 +3632,561 @@ CostMethodMergeUnion::scmComputeOperatorCostInternal(RelExpr *op,
   
 //
 
+// ---
+// Cost methods for write DML operations
+// ---
+
+
+// ---
+// This method is a stub for obsolete old cost model
+// ---
+Cost*
+CostMethodHbaseUpdateOrDelete::computeOperatorCostInternal(RelExpr* op,
+  const Context * context,
+  Lng32& countOfStreams)
+{
+  CMPASSERT(false);  // should never be called
+  return NULL;
+}
+
+// ---
+// CostMethodHbaseUpdateOrDelete::allKeyColumnsHaveHistogramStatistics()
+//
+// Returns TRUE if all key columns have histograms, FALSE if not.
+// ---
+NABoolean CostMethodHbaseUpdateOrDelete::allKeyColumnsHaveHistogramStatistics(
+  const IndexDescHistograms & histograms,
+  const IndexDesc * CIDesc
+  ) const
+{
+  // Check if all key columns have histogram statistics
+  NABoolean statsForAllKeyCols = TRUE;
+  for ( CollIndex j = 0; j < CIDesc->getIndexKey().entries(); j++ )
+  {
+    if (histograms.isEmpty())
+    {
+      statsForAllKeyCols = FALSE;
+      break;
+    }
+    else if (!histograms.getColStatsPtrForColumn((CIDesc->getIndexKey()) [j]))
+    {
+      // If we get a null pointer when we try to retrieve a
+      // ColStats for a column of this table, then no histogram
+      // data was created for that column.
+      statsForAllKeyCols = FALSE;
+      break;
+    }
+  }
+
+  return statsForAllKeyCols;
+}   // CostMethodHbaseUpdateOrDelete::allKeyColumnsHaveHistogramStatistics()
+
+// ---
+// CostMethodHbaseUpdateOrDelete::numRowsToScanWhenAllKeyColumnsHaveHistograms()
+//
+// Returns an estimate of the number of rows that will be scanned as a
+// result of applying key predicates. Assumes that histograms exist for
+// all key columns.
+// ---
+#pragma nowarn(262)   // warning elimination
+CostScalar
+CostMethodHbaseUpdateOrDelete::numRowsToScanWhenAllKeyColumnsHaveHistograms(
+  IndexDescHistograms & histograms,
+  const ColumnOrderList & keyPredsByCol,
+  const CostScalar & activePartitions,
+  const IndexDesc * CIDesc
+  ) const
+{
+
+  // Determine if there are single subset predicates:
+  CollIndex singleSubsetPrefixOrder;
+  NABoolean itIsSingleSubset =
+    keyPredsByCol.getSingleSubsetOrder( singleSubsetPrefixOrder );
+
+  NABoolean thereAreSingleSubsetPreds = FALSE;
+  if ( singleSubsetPrefixOrder > 0 )
+  {
+    thereAreSingleSubsetPreds = TRUE;
+  }
+  else
+  {
+    //  singleSubsetPrefixOrder==0  means either there
+    // is an equal, an IN,  or there are no key preds in the
+    // first column.
+    // singleSubsetPrefixOrder==0 AND itIsSingleSubset
+    // means there is an EQUAL or there are no key preds
+    // in the first column, check for existance of
+    // predicates in this case:
+    if ( itIsSingleSubset // this FALSE for an IN predicate
+         AND keyPredsByCol[0] != NULL
+       )
+    {
+      thereAreSingleSubsetPreds = TRUE;
+    }
+  }
+
+
+  CMPASSERT(NOT histograms.isEmpty());
+
+  if ( thereAreSingleSubsetPreds )
+  {
+    // -
+    // There are some key predicates, so apply them
+    // to the histograms and get the total rows:
+    // -
+
+    // Get all the key preds for the key columns up to the first
+    // key column with no key preds
+    ValueIdSet singleSubsetPrefixPredicates;
+    NABoolean conflict = FALSE;
+    for ( CollIndex i = 0; i <= singleSubsetPrefixOrder; i++ )
+    {
+
+      const ValueIdSet *predsPtr = keyPredsByCol[i];
+      CMPASSERT( predsPtr != NULL ); // it must contain preds
+      singleSubsetPrefixPredicates.insert( *predsPtr );
+
+    } // for every key col in the sing. subset prefix
+
+    // Apply those key predicates that reference key columns
+    // before the 
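
For readers following the quoted diff, the decision about whether any single-subset key predicates exist can be restated as a small standalone sketch. The struct below is a hypothetical stand-in for ColumnOrderList and for the values reported by getSingleSubsetOrder(); it is not Trafodion's real interface, and it only models whether each key column has predicates at all.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the inputs used in the quoted diff.
struct KeyPredSummary {
  std::vector<bool> hasPreds;            // hasPreds[i]: key column i has key predicates
  std::size_t singleSubsetPrefixOrder;   // value reported via getSingleSubsetOrder()
  bool isSingleSubset;                   // its return value (false for an IN predicate)
};

// Mirrors the if/else in the diff: a prefix order above zero always means
// single-subset predicates exist; at zero, the first key column must both
// be a single subset and actually carry predicates.
bool thereAreSingleSubsetPreds(const KeyPredSummary& k) {
  assert(!k.hasPreds.empty());  // assumption: at least one key column
  if (k.singleSubsetPrefixOrder > 0) {
    return true;
  }
  return k.isSingleSubset && k.hasPreds[0];
}
```

Under these assumptions, an equality predicate on the first key column alone yields true, while an IN predicate on the first column, or no predicates on it, yields false.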

[GitHub] incubator-trafodion pull request: [TRAFODION-25] First draft of co...

2015-10-05 Thread zellerh
Github user zellerh commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/102#discussion_r41206622
  
--- Diff: core/sql/optimizer/ScmCostMethod.cpp ---

[GitHub] incubator-trafodion pull request: [TRAFODION-1506] TRAFCI on windo...

2015-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/103




[GitHub] incubator-trafodion pull request: [TRAFODION-25] First draft of co...

2015-10-05 Thread zellerh
Github user zellerh commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/102#discussion_r41206781
  
--- Diff: core/sql/optimizer/ScmCostMethod.cpp ---

[GitHub] incubator-trafodion pull request: [TRAFODION-25] First draft of co...

2015-10-05 Thread sureshsubbiah
Github user sureshsubbiah commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/102#discussion_r41210123
  
--- Diff: core/sql/generator/GenPreCode.cpp ---
@@ -4385,12 +4377,14 @@ RelExpr * GenericUpdate::preCodeGen(Generator * generator,
  GenExit();
}
 
-  if ((getOperatorType() == REL_HBASE_DELETE) &&
+  if ((getOperatorType() != REL_HBASE_UPDATE) &&
--- End diff --

Just curious: what values can getOperatorType() have at this point in the
code other than REL_HBASE_DELETE and REL_HBASE_UPDATE?
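
To make the question concrete: the first clause of the old and new conditions differs only for operator types that are neither DELETE nor UPDATE. The sketch below uses a hypothetical three-value enum; REL_SOME_OTHER_OP is an invented placeholder, and the remaining clauses of the real condition (after the &&) are not modeled.

```cpp
#include <cassert>

// Hypothetical subset of operator types. REL_SOME_OTHER_OP is a placeholder
// for "any third value" and is an assumption of this sketch.
enum OperatorType { REL_HBASE_DELETE, REL_HBASE_UPDATE, REL_SOME_OTHER_OP };

// First clause of the condition before the change.
bool oldClause(OperatorType t) { return t == REL_HBASE_DELETE; }

// First clause of the condition after the change.
bool newClause(OperatorType t) { return t != REL_HBASE_UPDATE; }

// Checks where the two clauses agree and where they diverge.
void demonstrate() {
  assert(oldClause(REL_HBASE_DELETE) == newClause(REL_HBASE_DELETE));    // both true
  assert(oldClause(REL_HBASE_UPDATE) == newClause(REL_HBASE_UPDATE));    // both false
  assert(oldClause(REL_SOME_OTHER_OP) != newClause(REL_SOME_OTHER_OP));  // they diverge
}
```

If only DELETE and UPDATE can reach this point, the change has no effect on this clause; it matters only if some third operator type can.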




[GitHub] incubator-trafodion pull request: [TRAFODION-25] First draft of co...

2015-10-05 Thread DaveBirdsall
GitHub user DaveBirdsall opened a pull request:

https://github.com/apache/incubator-trafodion/pull/102

[TRAFODION-25] First draft of costing code for DELETE.

This is the first draft of costing code for the Delete operator
in Trafodion. Formerly a cost of 1 was returned for the Delete
operator. With this change, we can expect better plans for DELETE
statements.

Part of the code for the Update operator is in this check-in as
well, but the main part of the Update code remains to be written.

Two CQDs have been added to control execution of this code.

CQD HBASE_DELETE_COSTING 'ON' causes the new Delete costing code
to be executed. A value of 'OFF' causes the old stub logic (a
cost of 1) to be executed instead. The default for this CQD is
'ON', so this code can be exercised.

CQD HBASE_UPDATE_COSTING 'ON' causes partial Update costing code
to be executed. This results in a stub cost object of 1e32
being returned. (Actually, this activates a stub that was already
there in the code; this turned out to be useful in discovering
certain bugs, fixed below.) A value of 'OFF' causes the earlier
stub logic (a cost of 1) to be executed instead. The default
for this CQD is 'OFF' for now.

Once this code has been sufficiently validated, these CQDs will be
removed.
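
As a rough illustration of the gating described above, the sketch below models the two CQDs as plain strings. The CostingSettings struct and both function names are hypothetical; the real code reads CQDs through Trafodion's own defaults mechanism, which is not shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the two CQD values; defaults follow the description.
struct CostingSettings {
  std::string hbaseDeleteCosting = "ON";
  std::string hbaseUpdateCosting = "OFF";
};

const double kOldStubCost = 1.0;      // the old stub logic: a cost of 1
const double kUpdateStubCost = 1e32;  // stub returned when Update costing is ON

// Returns the modeled Delete cost when the CQD is ON, else the old stub cost.
double deleteCost(const CostingSettings& s, double modeledCost) {
  assert(modeledCost >= 0.0);  // assumption: costs are non-negative
  return s.hbaseDeleteCosting == "ON" ? modeledCost : kOldStubCost;
}

// Update costing is only partially written, so ON yields the 1e32 stub.
double updateCost(const CostingSettings& s) {
  return s.hbaseUpdateCosting == "ON" ? kUpdateStubCost : kOldStubCost;
}
```

With the defaults, Delete uses the new cost and Update keeps the old stub cost of 1.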

Several bugs are also fixed in this check-in. All are latent,
existing bugs that became more visible when regression-testing the
new costing code due to plan changes.

1. MERGE with a NOT MATCHED action was being (incorrectly)
transformed into a nested join in some plans. This caused the
NOT MATCHED action to not be executed. (optimizer/TransRule.cpp,
optimizer/RelUpdate.h)

2. MERGE error messages generated by the compiler were sometimes
incorrect. (optimizer/BindRelExpr.cpp, generator/GenPreCode.cpp)

3. A check constraint error message was not being reported when
it should have been on a table with SERIALIZED columns. This was
caused by an HBase filter expression being generated with operators
having the wrong sense (e.g., < instead of >).
(generator/GenPreCode.cpp)

4. CLEANUP would fail if the record in the OBJECTS_UNIQ_IDX index
was missing. This happened because it was using a vanilla DELETE
and the plan for the DELETE changed with the costing changes to 
use index access. The fix was to use CQDs to avoid accessing the
index when deleting from the OBJECTS table.
(sqlcomp/CmpSeabaseDDLcleanup.cpp)

5. INITIALIZE AUTHORIZATION would fail if there was a row in
the OBJECTS table with an object_owner that does not exist in
the AUTHS table. (The latter can happen in rare circumstances
when toggling authorization off and on repeatedly. That is a
bug in its own right that remains outstanding.) This has been
worked around by writing 'DB__ROOT' to the OBJECT_PRIVILEGES
table when the object_owner is invalid.
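
On item 3 above: the description does not say how the operators ended up with the wrong sense. One common way such a bug arises, offered here purely as an assumption, is swapping the operands of a comparison without mirroring the operator. The enum and functions below are illustrative and are not taken from GenPreCode.cpp.

```cpp
#include <cassert>

enum class CmpOp { Less, LessEq, Greater, GreaterEq, Equal };

// Returns the operator that keeps (a op b) equivalent to (b mirror(op) a).
CmpOp mirror(CmpOp op) {
  switch (op) {
    case CmpOp::Less:      return CmpOp::Greater;
    case CmpOp::LessEq:    return CmpOp::GreaterEq;
    case CmpOp::Greater:   return CmpOp::Less;
    case CmpOp::GreaterEq: return CmpOp::LessEq;
    case CmpOp::Equal:     return CmpOp::Equal;
  }
  assert(false && "unhandled comparison operator");
  return op;
}

// Evaluates (lhs op rhs) on plain integers.
bool evaluate(CmpOp op, int lhs, int rhs) {
  switch (op) {
    case CmpOp::Less:      return lhs < rhs;
    case CmpOp::LessEq:    return lhs <= rhs;
    case CmpOp::Greater:   return lhs > rhs;
    case CmpOp::GreaterEq: return lhs >= rhs;
    case CmpOp::Equal:     return lhs == rhs;
  }
  assert(false && "unhandled comparison operator");
  return false;
}
```

Swapping operands while keeping the same operator changes the result for any non-equal pair, which is the kind of filter that would silently accept or reject the wrong rows.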

Some other issues were noticed during unit testing that will
be addressed in later check-ins:

1. Scan costs for DELETEs using TSJ plans are grossly
overestimated. The cause is that the generated join predicates between
the Scan and Delete operators are being assigned a default
selectivity of 1/3. This is an existing, latent bug.

2. The HBaseDeleteRule in the optimizer considers a Delete-only
plan (that is, without a TSJ of a Scan to a Delete) when there
are equality predicates on all key columns, and when there are
no predicates on key columns at all. The former is correct; the
latter is almost never a good plan. Costing should prevent the
latter from being chosen, but considering it is still unnecessary
extra computation. (The rule does not consider Delete-only plans when
a proper subset of the key columns have equality predicates.)
This is an existing, latent bug.
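
The behavior described in item 2 can be sketched as a predicate over the key columns. Both functions are hypothetical illustrations rather than the actual HBaseDeleteRule code, and the sketch assumes that the only predicates on key columns are equality predicates. The stricter variant is one possible reading of the suggested improvement, not something the description commits to.

```cpp
#include <cassert>
#include <cstddef>

// As described: a Delete-only plan is considered when every key column has
// an equality predicate, or when no key column has any predicate.
bool currentRuleConsidersDeleteOnly(std::size_t keyCols, std::size_t eqCols) {
  assert(keyCols > 0 && eqCols <= keyCols);
  return eqCols == keyCols || eqCols == 0;
}

// Hypothetical stricter variant: skip the "no key predicates" case, which
// the description calls almost never a good plan.
bool stricterRuleConsidersDeleteOnly(std::size_t keyCols, std::size_t eqCols) {
  assert(keyCols > 0 && eqCols <= keyCols);
  return eqCols == keyCols;
}
```

For a three-column key, the current rule considers a Delete-only plan with 3 or 0 covered columns but not with 1 or 2; the stricter variant would consider it only with all 3.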

 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DaveBirdsall/incubator-trafodion DeleteCostingDraft1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/102.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #102


commit 8e0b3d44efb149d033dfadcbed4f99639a0b53fe
Author: Dave Birdsall 
Date:   2015-10-01T22:12:46Z

[TRAFODION-25] First draft of DELETE costing changes

commit a543b588938cfd2b750ea1957ec982173fb6b3e6
Author: Dave Birdsall 
Date:   2015-10-02T22:48:10Z

Make INITIALIZE AUTHORIZATION more robust

commit aa0994e15698341727a1ec5414303a7cd5924727
Author: Dave Birdsall 
Date:   2015-10-02T23:42:27Z

Updating expected files with DELETE plan changes



