[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-11-09 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@amansinha100 thanks for sharing the information.  Got your point. I think 
your propose on  
[CALCITE-1048](https://issues.apache.org/jira/browse/CALCITE-1048) is possible. 
Since [CALCITE-794](https://issues.apache.org/jira/browse/CALCITE-794) has 
completed at version 1.6 ,it seems there's a more perfect solution( to get the 
least max number of all the rels of the RelSubSet). But due to Drill's Caclite 
version is still based on 1.4 , I support your current temp solution. Only 
wonder that whether the explicitly searched RelNode's (such as 
DrillAggregateRel) maxRowCount can represent the best RelNode's maxRowCount ?  


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-11-08 Thread amansinha100
Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/889
  
@weijietong I am interested in getting your basic changes in.  It is 
unfortunate we are running into this issue with RelSubSet.  Let me see if I can 
make some changes on top of your changes  (I will keep your authorship) and 
have it pass all our functional regression tests. 


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-11-08 Thread amansinha100
Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/889
  
@weijietong I ran the functional tests with the PR and saw some failures.  
While debugging those failures, I found the simple scalar aggregate case works 
but anything with compound aggregate expressions  throws a CannotPlanException, 
for example:  

explain plan for select count(*) from cp.`tpch/lineitem.parquet` l where 
l_quantity < (select max(l2.l_quantity) + 1 from cp.`tpch/lineitem.parquet` l2)
 
Here, the right side of the inequality join is actually a scalar but the 
JoinUtils.isScalar() method does not detect it correctly.  The reason is the 
right side is a DrillProjectRel whose input is a RelSubSet.  The RelSubSet has 
a DrillAggregateRel but it looks like currently Calcite does not detect this is 
a scalar aggregate if it occurs as part of a RelSubSet.   I see Calcite has a 
comment in [1] which points to an open JIRA CALCITE-1048.  

[1] 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMdMaxRowCount.java#L165


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-11-03 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
ping @arina-ielchiieva @amansinha100 


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-10-28 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@amansinha100  I have removed the override of `getMaxRowCount` method of 
`TableScan` type to avoid your worry about unexpected results . This PR's 
target adjusts to just enhance the current cartesian join by leveraging 
Calcite's `RelMdMaxRowCount` method. I will solve our case specially at our env.


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-10-16 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva, @amansinha100 can you take another look at this one and 
see if your concerns have been addressed?


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-09-15 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva @amansinha100 any further advice ?


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-09-09 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva, looks like Weijie added a couple of commits since your 
+1. Can you take a look at them?


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-09-06 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva  @amansinha100 isScalarSubquery method has been 
refactored , please review.


---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-23 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/889
  
+1, LGMT.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-23 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva thanks for the advice, has corrected that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-21 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/889
  
Thanks @weijietong. LGTM.
@amansinha100 could you please do the final CR since you are familiar with 
this issue?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-20 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva other tests have passed separately and time zone related 
test cases have been corrected in PR 5717


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/889
  
On mine environment all tests pass. I used to have troubles but long time 
ago. I had to fix issues with timezones, memory usage `MAVEN_OPTS="-Xmx2048m 
-XX:MaxPermSize=1024m"`. Also I am pretty sure that some test have failed 
because the others did, usually the ones with Rpc exception. Maybe  `UserRemote 
SYSTEM ERROR: URISyntaxException` is connected with your env. You should check 
that. Also please rebase on the latest master (you can do `push -f` again), 
also if `TestTpchDistributedConcurrent.testConcurrentQueries` will be failing 
again, try to disable it but make sure it passes separately.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-10 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva  have passed all the unit tests under java-exec bundle , 
got some errors which are sure not associated with this change, maybe my branch 
was old from the master.
>
  TestConstantFolding.testConstantFolding_allTypes:163 »  
org.apache.drill.commo...
  TestCustomUserAuthenticator.positiveUserAuth » UserRemote SYSTEM ERROR: 
URISyn...
  TestCustomUserAuthenticator.positiveUserAuthAfterNegativeUserAuth » 
UserRemote
  TestInfoSchema.selectFromAllTables » UserRemote SYSTEM ERROR: 
URISyntaxExcepti...
  
TestViewSupport.infoSchemaWithView:355->BaseTestQuery.testRunAndReturn:331 » 
Rpc
  TestInfoSchemaFilterPushDown.testFilterPushdown_NonEqual » UserRemote 
SYSTEM E...
  TestParquetScan.testSuccessFile:64->BaseTestQuery.testRunAndReturn:331 » 
Rpc o...
  TestTpchDistributedConcurrent.testConcurrentQueries:190 »  test timed 
out afte...

I also found the error to do `mvn test` ,got the same error as JIRA 
DRILL-4104 . through `mvn test -pl exec/java-exec` this method,I got the unit 
test result. I wonder how the devs do the test result .





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-09 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/889
  
@weijietong thanks for code formatting.

> I wonder if the enhanced code does not break the current unit test ,it 
will be ok.
Can you please check this? Did all unit tests pass after the change?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-08 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva  your test case can not reproduce the error . You can 
search the dev email to find the origin error description with the keyword 
"Drill query planning error".  Your query already satisfy the 
NestedLoopJoinPrule. My case is that I add another rule to change the 
Aggregate-->Aggregate-->Scan to Scan as the transformed Scan relnode already 
holding the count(distinct ) value. When this transformation occurs, the 
NestedLoopJoinPrule's checkPreconditions method will invoke 
JoinUtils.hasScalarSubqueryInput. Then it will fail, as the transformed relnode 
has no aggregate node which does not satisfy the current scalar rule. 

I think it's hard to reproduce this error without a specific rule like what 
I do. the precondition is:
1. a nested loop join
2. no (aggregate--> aggregate) count distinct relation nodes in the plan
3. the row number of one child of the nested loop join is 1 .

I wonder if the enhanced code does not break the current unit test ,it will 
be ok.





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-08 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/889
  
@weijietong regarding the unit test, I have tried to reproduce the problem 
and written the following unit test:
```
  @Test
  public void test() throws Exception {
FileSystem fs = null;
try {
  fs = FileSystem.get(new Configuration());

  // create table with partition pruning
  test("use dfs_test.tmp");
  String tableName = "table_with_pruning";
  Path dataFile = new 
Path(TestTools.getWorkingPath(),"src/test/resources/parquet/alltypes_required.parquet");
  test("create table %s partition by (col_int) as select * from 
dfs.`%s`", tableName, dataFile);

  // generate metadata
  test("refresh table metadata `%s`", tableName);
  
  // execute query
  String query = String.format("select count(distinct col_int), 
count(distinct col_chr) from `%s` where col_int = 45436", tableName);
  test(query);

} finally {
  if (fs != null) {
fs.close();
  }
}
  }
```
`AbstractGroupScan.getScanStats` method returns one row but it does not 
fail. Can you please take a look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-04 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
@arina-ielchiieva  I have corrected the codes as you guide. But sorry for 
the unit tests, I have tried a long time to simulate a scan with 1 row ,but 
failed to do that.  The row count of a scan is fetched from the 
AbstractGroupScan.getScanStats method. At my plugin ,I override this method to 
ensure it will return 1 .  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-08-02 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/889
  
@weijietong thanks for the PR.
1. Is it possible to add unit tests?
2. Can you please add in method description that a table scan with only one 
row should also be considered as a scalar as well.
3. Please fix indention in `if` statements and change `currentrel` to camel 
case [1].

[1] https://drill.apache.org/docs/apache-drill-contribution-guidelines/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---