[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-19 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248836#comment-15248836
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

Sure

> HivePhoenixHandler for big-big join with predicate push down
> 
>
> Key: PHOENIX-2743
> URL: https://issues.apache.org/jira/browse/PHOENIX-2743
> Project: Phoenix
> Issue Type: New Feature
> Affects Versions: 4.5.0, 4.6.0
> Environment: hive-1.2.1
> Reporter: JeongMin Ju
> Assignee: Sergey Soldatov
> Labels: features, performance
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2743-1.patch, hivephoenixhandler.jstack
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Phoenix supports hash join and sort-merge join, but it does not handle 
> big-big joins well.
> Therefore another method is needed, such as Hive.
> I implemented hive-phoenix-handler, which can access Apache Phoenix tables on 
> HBase using HiveQL.
> hive-phoenix-handler is much faster than hive-hbase-handler because it 
> applies predicate push down.
> I am publishing the source code to GitHub for contribution; it will probably 
> be completed by next week.
> https://github.com/mini666/hive-phoenix-handler
> Please review my proposal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-19 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248829#comment-15248829
 ] 

Josh Elser commented on PHOENIX-2743:
-

Alrighty, I just pushed this to master and 4.x-HBase-1.0. [~sergey.soldatov], 
I hit lots of errors trying to backport it to 4.x-HBase-0.98. Can you put 
together a patch for that, please?



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248826#comment-15248826
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user asfgit closed the pull request at:

https://github.com/apache/phoenix/pull/155




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-19 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248640#comment-15248640
 ] 

Josh Elser commented on PHOENIX-2743:
-

bq. Could it be that you have core-site.xml in your path/classpath and it 
interferes with the test? 

I unset all of my {{HADOOP_*}} environment variables and that did the trick. It 
seems like I should be filing a bug against MiniMR now (goody).
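
A minimal sketch of the step described above, assuming a POSIX-compatible shell. The function name `unset_hadoop_env` is made up for illustration and is not part of Phoenix or Hadoop; it simply drops every variable whose name starts with `HADOOP_` from the current shell.

```shell
# Unset every environment variable whose name starts with HADOOP_
# in the current shell (POSIX sh compatible).
unset_hadoop_env() {
  # env prints NAME=value pairs; keep only the names that begin with HADOOP_.
  for name in $(env | sed -n 's/^\(HADOOP_[A-Za-z0-9_]*\)=.*/\1/p'); do
    unset "$name"
  done
}
```

Define the function and call `unset_hadoop_env` in the shell that will run the build, then rerun the test from that same shell. The change only affects that shell session and its child processes.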

I'll see if I can come up with a quick workaround (or at least a warning for 
others who might run into it) and then push these changes.

Thanks for your help, Sergey!



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-19 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248624#comment-15248624
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

[~elserj] Nope, on my setup the only problem I had was with the number of open 
files, and it passed fine once I set ulimit -n. Could it be that you have 
core-site.xml in your path/classpath and it interferes with the test? 
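
The open-file limit fix mentioned above can be sketched as follows. This is an illustration only: the helper name `ensure_open_files` is hypothetical, and the thread does not say which limit value was actually used, so the target is passed in as a parameter.

```shell
# Raise the soft open-file limit of the current shell to at least $1.
# Returns non-zero if the limit could not be raised
# (for example because the hard limit is lower than the request).
ensure_open_files() {
  wanted=$1
  current=$(ulimit -n)
  # "unlimited" already satisfies any request.
  if [ "$current" = "unlimited" ]; then
    return 0
  fi
  if [ "$current" -lt "$wanted" ]; then
    ulimit -n "$wanted" 2>/dev/null || return 1
  fi
  return 0
}
```

For example, `ensure_open_files 4096 || echo "could not raise the open-file limit" >&2` before starting the integration tests; 4096 is an assumed value, not one taken from this thread.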



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-19 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248614#comment-15248614
 ] 

Josh Elser commented on PHOENIX-2743:
-

I've gotten HivePhoenixStoreIT to pass in Eclipse, but I have still been unable 
to get it to pass on the command line (same command as above). The MRAppMaster 
for (what I assume is) the Hive job is failing to talk to HDFS, thinking that 
it should be at localhost:8020.

{noformat}
2016-04-19 16:48:46,179 ERROR [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Failed while checking for/creating  history staging path: [hdfs://localhost:8020/tmp/hadoop-yarn/staging/jelser/.staging/job_1461098862737_0002]
java.net.ConnectException: Call From myhostname/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1480)
    at org.apache.hadoop.ipc.Client.call(Client.java:1407)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.mkdir(JobHistoryEventHandler.java:267)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceInit(JobHistoryEventHandler.java:166)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:450)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1518)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
    at

[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247259#comment-15247259
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

GitHub user ss77892 opened a pull request:

https://github.com/apache/phoenix/pull/165

PHOENIX-2743 HivePhoenixHandler for big-big join with predicate push …

Rebased on the current master as a single patch.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ss77892/phoenix PHOENIX-2743-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/phoenix/pull/165.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #165


commit 684c127a936295a4bdd16f48ed1723dd35a40e6f
Author: Sergey Soldatov 
Date:   2016-04-19T06:36:36Z

PHOENIX-2743 HivePhoenixHandler for big-big join with predicate push down






[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247173#comment-15247173
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

Yep, that's because of the recent commit about splits. Got it fixed, testing at 
the moment. I will make a new pull request rebased on the master.



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247160#comment-15247160
 ] 

Josh Elser commented on PHOENIX-2743:
-

[~sergey.soldatov], some compilation issues on the latest?

{noformat}
[INFO] -
[ERROR] COMPILATION ERROR :
[INFO] -
[ERROR] /Users/jelser/projects/phoenix.git/phoenix-hive/src/main/java/org/apache/phoenix/hive/mapreduce/PhoenixRecordReader.java:[116,65] no suitable constructor found for TableResultIterator(org.apache.phoenix.execute.MutationState,org.apache.phoenix.schema.TableRef,org.apache.hadoop.hbase.client.Scan,org.apache.phoenix.monitoring.CombinableMetric,long)
    constructor org.apache.phoenix.iterate.TableResultIterator.TableResultIterator(org.apache.phoenix.execute.MutationState,org.apache.hadoop.hbase.client.Scan,org.apache.phoenix.monitoring.CombinableMetric,long,org.apache.phoenix.iterate.PeekingResultIterator,org.apache.phoenix.compile.QueryPlan,org.apache.phoenix.iterate.ParallelScanGrouper) is not applicable
      (actual and formal argument lists differ in length)
    constructor org.apache.phoenix.iterate.TableResultIterator.TableResultIterator(org.apache.phoenix.execute.MutationState,org.apache.hadoop.hbase.client.Scan,org.apache.phoenix.monitoring.CombinableMetric,long,org.apache.phoenix.compile.QueryPlan,org.apache.phoenix.iterate.ParallelScanGrouper) is not applicable
      (actual and formal argument lists differ in length)
    constructor org.apache.phoenix.iterate.TableResultIterator.TableResultIterator() is not applicable
      (actual and formal argument lists differ in length)
[INFO] 1 error
{noformat}



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246447#comment-15246447
 ] 

Josh Elser commented on PHOENIX-2743:
-

Fantastic. I will do the same!



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246440#comment-15246440
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

[~elserj] Pushed. I was checking that changing to 2.7.1 doesn't cause any 
other IT failures. Actually it's still running, but no new failures so far. 
Will update if anything else fails.



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246404#comment-15246404
 ] 

Josh Elser commented on PHOENIX-2743:
-

bq. Got it. As for the exception: it turns out that hadoop-auth 2.5.1 is 
pulled in with HBase. Will fix that shortly.

Just so it's not silent: "ok". Just ping me when you get this updated -- I 
checked GH and didn't see any new commits. If it's helpful at all, I was using 
`mvn dependency:tree` on the command line to track this down myself.
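
As a rough sketch of how the `mvn dependency:tree` output can be narrowed down to the Hadoop artifacts in question, the helper below reads the tree on stdin and prints each `org.apache.hadoop` artifact with its version. The function name `hadoop_versions` is made up for illustration, and it assumes the usual `groupId:artifactId:type:version:scope` coordinate layout printed by the plugin.

```shell
# Read `mvn dependency:tree` output on stdin and print "artifactId version"
# for every org.apache.hadoop artifact, de-duplicated and sorted.
hadoop_versions() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^org\.apache\.hadoop:/) {
        n = split($i, part, ":")
        # Coordinates end in ...:version:scope, so the version is second to last.
        print part[2], part[n - 1]
      }
    }
  }' | sort -u
}
```

Typical use would be `mvn dependency:tree | hadoop_versions`. If hadoop-auth and hadoop-common show different versions, that matches the mismatch discussed in this thread. Passing `-Dincludes=org.apache.hadoop` to `dependency:tree` should restrict the tree itself to that group.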



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246155#comment-15246155
 ] 

Josh Elser commented on PHOENIX-2743:
-

bq. This error usually happens when the hadoop-auth version is different from 
hadoop-common

Yeah, right you are again. The HBase dependencies are still pulling in 
hadoop-2.5.1.



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246123#comment-15246123
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

Interesting. This error usually happens when the hadoop-auth version is 
different from hadoop-common. For the last changes I ran the IT tests in the 
IDE; let me rerun them from Maven. 



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246149#comment-15246149
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

Got it. As for the exception: it turns out that hadoop-auth 2.5.1 is pulled in 
with HBase. Will fix that shortly.



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246124#comment-15246124
 ] 

Josh Elser commented on PHOENIX-2743:
-

{quote}
bq. I would assume that it's not necessary for any of them, but let me know 
if I'm wrong

Actually, I added that only for the packages that are not in the dependencies in 
the root pom.xml. Not sure whether it's good to include them there since they 
are used in this package only and the module has its own artifact. If we want 
to make it part of phoenix-client.jar, those dependencies can be moved to 
the root pom.xml with the version specified there.
{quote}

Ah! I'm with you now. Convention states that we should have all dependencies 
declared in dependencyManagement and just refer to dependencies by 
groupId:artifactId in sub-modules. I will clean that up too (super easy to 
change). Thanks.



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246084#comment-15246084
 ] 

Josh Elser commented on PHOENIX-2743:
-

{noformat}
java.lang.RuntimeException: java.lang.NoSuchMethodError: org.apache.hadoop.security.authentication.server.AuthenticationFilter.constructSecretProvider(Ljavax/servlet/ServletContext;Ljava/util/Properties;Z)Lorg/apache/hadoop/security/authentication/util/SignerSecretProvider;
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.security.authentication.server.AuthenticationFilter.constructSecretProvider(Ljavax/servlet/ServletContext;Ljava/util/Properties;Z)Lorg/apache/hadoop/security/authentication/util/SignerSecretProvider;
{noformat}

Did you have all of the tests successfully running, [~sergey.soldatov]? It 
seems like there might be some lingering Hadoop version change issues. I'll 
look into it, but I just wanted to let you know :)



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246077#comment-15246077
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

{quote}
I would assume that it's not necessary for any of them, but let me know if 
I'm wrong
{quote}
Actually, I added the version only for the packages that are not in the dependencies 
in the root {{pom.xml}}. I'm not sure whether it's good to include them there, since 
they are used in this package only and the module has its own artifact. If we 
want to make it part of {{phoenix-client.jar}}, those dependencies can be 
moved to the root {{pom.xml}} and the version specified there. 





[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245935#comment-15245935
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-211454505
  
@ss77892 one more nit-pick: it seems like you removed the version for some 
of the entries in the pom but then added `${hadoop-two.version}` for others. I 
would assume that it's not necessary for any of them, but let me know if I'm 
wrong. I can also just fix this while merging it in as long as @JamesRTaylor 
thinks this is good to go too :)




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245945#comment-15245945
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user JamesRTaylor commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-211456028
  
It's in your most capable hands, @joshelser & @SergeySoldatov. Thanks so 
much for whipping this into shape!




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243814#comment-15243814
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-210678972
  
Recent improvements look good! The new integration test is *very* nice. The 
ObjectInspector stuff isn't a huge deal to replace, but it's something that can 
be consolidated later. Left a few minor comments, but I think this is good to go 
after those are fixed.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243805#comment-15243805
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59953321
  
--- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java ---
@@ -6,9 +6,9 @@
  * to you under the Apache License, Version 2.0 (the
  * "License"); you may not use this file except in compliance
  * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
+ * 
--- End diff --

Adding in the paragraph tags might break the RAT check. Best to revert that.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243797#comment-15243797
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59953160
  
--- Diff: phoenix-hive/pom.xml ---
@@ -117,66 +93,44 @@
   
 
 
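-      <groupId>org.apache.hbase</groupId>
-      <artifactId>hbase-common</artifactId>
-    </dependency>
-    <dependency>
-      <groupId>org.apache.hbase</groupId>
-      <artifactId>hbase-protocol</artifactId>
-    </dependency>
-    <dependency>
-      <groupId>org.apache.hbase</groupId>
-      <artifactId>hbase-client</artifactId>
-    </dependency>
-    <dependency>
-      <groupId>org.apache.hbase</groupId>
-      <artifactId>hbase-hadoop-compat</artifactId>
-      <scope>test</scope>
-    </dependency>
-    <dependency>
-      <groupId>org.apache.hbase</groupId>
-      <artifactId>hbase-hadoop-compat</artifactId>
-      <type>test-jar</type>
-      <scope>test</scope>
-    </dependency>
-    <dependency>
-      <groupId>org.apache.hbase</groupId>
-      <artifactId>hbase-hadoop2-compat</artifactId>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-hdfs</artifactId>
+      <version>2.7.1</version>
       <scope>test</scope>
     </dependency>
     <dependency>
-      <groupId>org.apache.hbase</groupId>
-      <artifactId>hbase-hadoop2-compat</artifactId>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-hdfs</artifactId>
+      <version>2.7.1</version>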
--- End diff --

Given James' previous comment, I think you can safely change the hadoop 
version at the top-level pom. Then, you won't have to specify 2.7.1 down here 
in phoenix-hive.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239340#comment-15239340
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59558697
  
--- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/mapreduce/PhoenixResultWritable.java ---
@@ -0,0 +1,215 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.mapreduce;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.sql.ResultSetMetaData;
+import java.sql.SQLException;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.conf.Configurable;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapreduce.lib.db.DBWritable;
+import org.apache.phoenix.hive.PhoenixRowKey;
+import org.apache.phoenix.hive.constants.PhoenixStorageHandlerConstants;
+import org.apache.phoenix.hive.util.PhoenixStorageHandlerUtil;
+import org.apache.phoenix.hive.util.PhoenixUtil;
+import org.apache.phoenix.util.ColumnInfo;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+
+/**
+ * Serialized class for SerDe
+ *
+ */
+public class PhoenixResultWritable implements Writable, DBWritable, Configurable {
+
+private static final Log LOG = LogFactory.getLog(PhoenixResultWritable.class);
+
+private List columnMetadataList;
+private List valueList;// for output
+private Map rowMap = Maps.newHashMap();  // for input
+
+private int columnCount = -1;
+
+private Configuration config;
+private boolean isTransactional;
+private Map rowKeyMap = Maps.newLinkedHashMap();
+private List primaryKeyColumnList;
+
+public PhoenixResultWritable() {
+}
+
+public PhoenixResultWritable(Configuration config) throws IOException {
+setConf(config);
+}
+
+public PhoenixResultWritable(Configuration config, List columnMetadataList) throws IOException {
+this(config);
+this.columnMetadataList = columnMetadataList;
+
+valueList = Lists.newArrayListWithExpectedSize(columnMetadataList.size());
+}
+
+@Override
+public void write(DataOutput out) throws IOException {
+}
--- End diff --

Maybe add `throw new UnsupportedOperationException()` then?
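
To spell the suggestion out, a minimal sketch follows. It assumes the `DataOutput` path really is never used. `StatementOnlyRecord` is a hypothetical stand-in rather than the actual `PhoenixResultWritable`, and it leaves out the Hadoop interfaces so that it compiles on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for a Writable whose DataOutput path is intentionally unused.
final class StatementOnlyRecord {

    /** Fails loudly instead of silently writing nothing. */
    public void write(DataOutput out) throws IOException {
        throw new UnsupportedOperationException(
                "write(DataOutput) is not supported; rows are written through a PreparedStatement");
    }

    public static void main(String[] args) throws IOException {
        DataOutput sink = new DataOutputStream(new ByteArrayOutputStream());
        try {
            new StatementOnlyRecord().write(sink);
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

An empty method body would let an accidental call pass unnoticed; the exception makes the unsupported path visible right away.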




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239337#comment-15239337
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59558405
  
--- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/objectinspector/PhoenixObjectInspectorFactory.java ---
@@ -0,0 +1,150 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.objectinspector;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters;
+import org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyObjectInspectorFactory;
+import org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.ObjectInspectorOptions;
+import org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+
+public class PhoenixObjectInspectorFactory {
--- End diff --

Haha, ok. I want to say that I think they are superfluous -- Hive already 
provides implementations for the OI's. {HBase,Accumulo}StorageHandler should 
already have examples of how they work. Using those would reduce the amount of 
code to maintain here, which is great.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238883#comment-15238883
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user ss77892 commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59513925
  
--- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/objectinspector/PhoenixObjectInspectorFactory.java ---
@@ -0,0 +1,150 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.objectinspector;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters;
+import org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyObjectInspectorFactory;
+import org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.ObjectInspectorOptions;
+import org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+
+public class PhoenixObjectInspectorFactory {
--- End diff --

Honestly speaking, I'm not sure :) That's how it came from the people who 
contributed this code. 




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238858#comment-15238858
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user ss77892 commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59511702
  
--- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/mapreduce/PhoenixResultWritable.java ---
@@ -0,0 +1,215 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.mapreduce;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.sql.ResultSetMetaData;
+import java.sql.SQLException;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.conf.Configurable;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapreduce.lib.db.DBWritable;
+import org.apache.phoenix.hive.PhoenixRowKey;
+import org.apache.phoenix.hive.constants.PhoenixStorageHandlerConstants;
+import org.apache.phoenix.hive.util.PhoenixStorageHandlerUtil;
+import org.apache.phoenix.hive.util.PhoenixUtil;
+import org.apache.phoenix.util.ColumnInfo;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+
+/**
+ * Serialized class for SerDe
+ *
+ */
+public class PhoenixResultWritable implements Writable, DBWritable, Configurable {
+
+private static final Log LOG = LogFactory.getLog(PhoenixResultWritable.class);
+
+private List columnMetadataList;
+private List valueList;// for output
+private Map rowMap = Maps.newHashMap();  // for input
+
+private int columnCount = -1;
+
+private Configuration config;
+private boolean isTransactional;
+private Map rowKeyMap = Maps.newLinkedHashMap();
+private List primaryKeyColumnList;
+
+public PhoenixResultWritable() {
+}
+
+public PhoenixResultWritable(Configuration config) throws IOException {
+setConf(config);
+}
+
+public PhoenixResultWritable(Configuration config, List columnMetadataList) throws IOException {
+this(config);
+this.columnMetadataList = columnMetadataList;
+
+valueList = Lists.newArrayListWithExpectedSize(columnMetadataList.size());
+}
+
+@Override
+public void write(DataOutput out) throws IOException {
+}
--- End diff --

We are using it in our PhoenixRecordWriter, and we only ever call write with a 
statement.
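
As a rough illustration of that statement-only path, here is a sketch that binds a row's values to a `PreparedStatement` by position. `RowValues` and `recordingStatement` are hypothetical names for this example, and the proxy is only a stand-in so the sketch runs without a database; the real writer's logic may well differ.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a record that is only ever written through a JDBC statement.
final class RowValues {
    private final List<Object> values;

    RowValues(List<?> values) {
        this.values = new ArrayList<>(values);
    }

    /** Binds each value in order; JDBC parameter indexes start at 1. */
    public void write(PreparedStatement statement) throws SQLException {
        for (int i = 0; i < values.size(); i++) {
            statement.setObject(i + 1, values.get(i));
        }
    }

    /** Stand-in statement that records setObject calls, so no database is needed. */
    static PreparedStatement recordingStatement(List<String> log) {
        return (PreparedStatement) Proxy.newProxyInstance(
                RowValues.class.getClassLoader(),
                new Class<?>[] { PreparedStatement.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("setObject")) {
                        log.add(args[0] + "=" + args[1]);
                    }
                    return null; // setObject returns void; nothing else is called here
                });
    }

    public static void main(String[] args) throws SQLException {
        List<String> log = new ArrayList<>();
        new RowValues(Arrays.asList("k1", 42)).write(recordingStatement(log));
        System.out.println(log);
    }
}
```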




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238852#comment-15238852
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user ss77892 commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59511383
  
--- Diff: phoenix-hive/src/main/java/org/apache/phoenix/hive/ql/index/IndexPredicateAnalyzer.java ---
@@ -0,0 +1,486 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.ql.index;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.Stack;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.ql.exec.FunctionRegistry;
+import org.apache.hadoop.hive.ql.lib.DefaultGraphWalker;
+import org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher;
+import org.apache.hadoop.hive.ql.lib.Dispatcher;
+import org.apache.hadoop.hive.ql.lib.GraphWalker;
+import org.apache.hadoop.hive.ql.lib.Node;
+import org.apache.hadoop.hive.ql.lib.NodeProcessor;
+import org.apache.hadoop.hive.ql.lib.NodeProcessorCtx;
+import org.apache.hadoop.hive.ql.lib.Rule;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDescUtils;
+import org.apache.hadoop.hive.ql.plan.ExprNodeFieldDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFBaseCompare;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFIn;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNot;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToBinary;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToChar;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToDate;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToDecimal;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUtcTimestamp;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToVarchar;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+
+import com.google.common.collect.Lists;
+
+/**
+ * Clone of org.apache.hadoop.hive.ql.index.IndexPredicateAnalyzer with modifying
+ * analyzePredicate method.
+ *
+ *
+ */
+public class IndexPredicateAnalyzer {
+
+private static final Log LOG = LogFactory.getLog(IndexPredicateAnalyzer.class);
+
+private final Set udfNames;
+private final Map columnToUDFs;
+private FieldValidator fieldValidator;
+
+private boolean acceptsFields;
+
+public IndexPredicateAnalyzer() {
+udfNames = new HashSet();
+columnToUDFs = new HashMap();
+}
+
+public void setFieldValidator(FieldValidator fieldValidator) {
+this.fieldValidator = fieldValidator;
+}
+
+/**
+ * Registers a comparison operator as one which can be satisfied by an index
+ * search. Unless this is called, analyzePredicate will never find any
+ * indexable conditions.
+ *
+ * @param udfName name of 

[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236326#comment-15236326
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user JamesRTaylor commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-208640594
  
I think it's ok to change Phoenix to specify Hadoop 2.7.1. There's no 
particular reason it's set to 2.5.1 (other than no one has changed it in a 
while). See PHOENIX-2761.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236323#comment-15236323
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user ss77892 commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-208638163
  
Yep, I know. The problem is that hbase-it is built against Hadoop 2.5.1, and if I 
set the Hadoop version to 2.7.1, the DFS minicluster for HBase fails to start.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236273#comment-15236273
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-208626268
  
> it's hard to get all those miniclusters run together. I still hope to get 
> it resolved. That's why the pom contains dependencies on calcite, hbase-test 
> and others. 

If it's just at the test scope, you might be able to circumvent the issue 
by just overriding the hbase and hadoop versions to compatible versions only at 
the test scope.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236014#comment-15236014
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user ss77892 commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59285747
  
--- Diff: phoenix-hive/pom.xml ---
@@ -0,0 +1,224 @@
+
+
+
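+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.phoenix</groupId>
+    <artifactId>phoenix</artifactId>
+    <version>4.8.0-HBase-1.1-SNAPSHOT</version>
+  </parent>
+  <artifactId>phoenix-hive</artifactId>
+  <name>Phoenix - Hive</name>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+      <classifier>tests</classifier>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>joda-time</groupId>
+      <artifactId>joda-time</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-exec</artifactId>
+      <version>${hive.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.apache.calcite</groupId>
+          <artifactId>*</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-core</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-avatica</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-common</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-cli</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>jline</groupId>
+      <artifactId>jline</artifactId>
+      <version>0.9.94</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+      <version>${commons-lang.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-logging</groupId>
+      <artifactId>commons-logging</artifactId>
+      <version>${commons-logging.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-testing-util</artifactId>
+      <scope>test</scope>
+      <optional>true</optional>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-it</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-protocol</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-client</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-annotations</artifactId>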
--- End diff --

It will be there as soon as I get miniclusters working for ITs. 




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236013#comment-15236013
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user ss77892 commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-208572335
  
Actually, I want to add some integration tests similar to the ones done in 
PHOENIX-331, but there is a clash between versions. This implementation uses 
Hive 1.2.1, so running the Hive minicluster requires Hadoop version >= 2.6, 
but the HBase artifacts still use 2.5, so it's hard to get all those 
miniclusters running together. I still hope to get it resolved. That's why the 
pom contains dependencies on calcite, hbase-test and others. 
I will add javadocs and clean up commented-out code where possible.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235909#comment-15235909
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on the pull request:

https://github.com/apache/phoenix/pull/155#issuecomment-208542873
  
Some general thoughts (I stopped leaving them inline every time I saw them). 
I'm guessing you "inherited" some of these from JeongMin's original work.

* Dbl-check indentations
* Try to remove commented out code
* Some class-level javadoc comments would be *amazing*
* Not a single unit test? :)

Other things that I remember biting me previously:

* Make sure you try to run with Tez as well, both in the "uber" (local job) 
mode and as a normal Tez task. There are subtleties between them (sadly, I 
don't remember the specifics anymore).

Other general thoughts:
* The RecordUpdater implementation looks pretty cool. Didn't know they made 
this available for StorageHandlers.
* Hive has a decent suite for running Hive tests as a part of their build 
(which includes tests for StorageHandlers) in its qtest/itest modules. You 
might be able to take some inspiration from these for testing.

Looks good so far. It will be a nice bridge between Phoenix and Hive (as we 
work towards a common-core of Calcite).




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235903#comment-15235903
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59273812
  
--- Diff: 
phoenix-hive/src/main/java/org/apache/phoenix/hive/ql/index/IndexPredicateAnalyzer.java
 ---
@@ -0,0 +1,486 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.ql.index;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.Stack;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.ql.exec.FunctionRegistry;
+import org.apache.hadoop.hive.ql.lib.DefaultGraphWalker;
+import org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher;
+import org.apache.hadoop.hive.ql.lib.Dispatcher;
+import org.apache.hadoop.hive.ql.lib.GraphWalker;
+import org.apache.hadoop.hive.ql.lib.Node;
+import org.apache.hadoop.hive.ql.lib.NodeProcessor;
+import org.apache.hadoop.hive.ql.lib.NodeProcessorCtx;
+import org.apache.hadoop.hive.ql.lib.Rule;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDescUtils;
+import org.apache.hadoop.hive.ql.plan.ExprNodeFieldDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFBaseCompare;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFIn;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNot;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNotNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNull;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToBinary;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToChar;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToDate;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToDecimal;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUtcTimestamp;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDFToVarchar;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+
+import com.google.common.collect.Lists;
+
+/**
+ * Clone of org.apache.hadoop.hive.ql.index.IndexPredicateAnalyzer with 
modifying
+ * analyzePredicate method.
+ *
+ *
+ */
+public class IndexPredicateAnalyzer {
+
+private static final Log LOG = 
LogFactory.getLog(IndexPredicateAnalyzer.class);
+
+private final Set<String> udfNames;
+private final Map<String, Set<String>> columnToUDFs;
+private FieldValidator fieldValidator;
+
+private boolean acceptsFields;
+
+public IndexPredicateAnalyzer() {
+udfNames = new HashSet<String>();
+columnToUDFs = new HashMap<String, Set<String>>();
+}
+
+public void setFieldValidator(FieldValidator fieldValidator) {
+this.fieldValidator = fieldValidator;
+}
+
+/**
+ * Registers a comparison operator as one which can be satisfied by an 
index
+ * search. Unless this is called, analyzePredicate will never find any
+ * indexable conditions.
+ *
+ * @param udfName name of 
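
The javadoc quoted above (cut off in this message) describes an allow-list: 
only comparison operators that were registered beforehand are treated as 
pushable. A minimal sketch of that idea follows. It is a simplified model, not 
the Hive or Phoenix class: ComparisonOpAllowList, Condition and pushable are 
hypothetical names, and operators are modelled as short strings such as "=" 
purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the allow-list behaviour described in the javadoc;
// it does not use or reproduce the Hive/Phoenix classes.
final class ComparisonOpAllowList {

    /** One "column op value" condition, reduced to plain strings for the sketch. */
    static final class Condition {
        final String column;
        final String udfName;
        final String value;

        Condition(String column, String udfName, String value) {
            this.column = column;
            this.udfName = udfName;
            this.value = value;
        }
    }

    private final Set<String> udfNames = new HashSet<>();

    /** Registers an operator name as one that may be pushed down. */
    void addComparisonOp(String udfName) {
        udfNames.add(udfName);
    }

    /** Returns only the conditions whose operator was registered; the rest stay residual. */
    List<Condition> pushable(List<Condition> conditions) {
        List<Condition> result = new ArrayList<>();
        for (Condition c : conditions) {
            if (udfNames.contains(c.udfName)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Condition> predicate = Arrays.asList(
            new Condition("id", "=", "1"),
            new Condition("name", "like", "a%"));

        ComparisonOpAllowList analyzer = new ComparisonOpAllowList();
        // Nothing is registered yet, so no condition qualifies.
        System.out.println(analyzer.pushable(predicate).size());
        analyzer.addComparisonOp("=");
        // After registration only the "=" condition qualifies.
        System.out.println(analyzer.pushable(predicate).size());
    }
}
```

With nothing registered, no condition is returned, which matches the javadoc's 
note that analyzePredicate finds nothing until an operator is registered.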

[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235884#comment-15235884
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59272080
  
--- Diff: 
phoenix-hive/src/main/java/org/apache/phoenix/hive/objectinspector/PhoenixObjectInspectorFactory.java
 ---
@@ -0,0 +1,150 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.objectinspector;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters;
+import 
org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyObjectInspectorFactory;
+import 
org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.ObjectInspectorOptions;
+import org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+
+public class PhoenixObjectInspectorFactory {
--- End diff --

I'm wondering why you need this factory (and the accompanying OI 
implementations for the different types) rather than reusing the Hive 
ObjectInspector implementations. Is it something to do with strongly-typed 
values from Phoenix (as opposed to HBase/Accumulo's implementations)?
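
To make the distinction in that question concrete, here is a minimal sketch, 
assuming the guess is right that Phoenix hands back typed Java objects while a 
byte-oriented handler has to parse serialized values on access. IntInspector, 
LAZY_TEXT and TYPED are hypothetical stand-ins and are not Hive or Phoenix APIs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; these are not Hive ObjectInspector classes.
final class InspectorSketch {

    /** Stand-in for an inspector that yields an int from whatever the storage layer returned. */
    interface IntInspector {
        int get(Object stored);
    }

    /** Byte-oriented style: the stored value is serialized text that is parsed on access. */
    static final IntInspector LAZY_TEXT =
        stored -> Integer.parseInt(new String((byte[]) stored, StandardCharsets.UTF_8));

    /** Strongly-typed style: the stored value is already a Java Integer, so a cast is enough. */
    static final IntInspector TYPED = stored -> (Integer) stored;

    public static void main(String[] args) {
        System.out.println(LAZY_TEXT.get("42".getBytes(StandardCharsets.UTF_8)));
        System.out.println(TYPED.get(Integer.valueOf(42)));
    }
}
```

Both inspectors yield the same int; they differ only in what they expect from 
the storage layer, which is one plausible reason a handler might ship its own 
inspectors.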




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235874#comment-15235874
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59271033
  
--- Diff: 
phoenix-hive/src/main/java/org/apache/phoenix/hive/constants/PhoenixStorageHandlerConstants.java
 ---
@@ -0,0 +1,101 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive.constants;
+
+import java.util.List;
+
+import org.apache.hadoop.io.IntWritable;
+
+import com.google.common.collect.Lists;
+
+/**
+ * Constants using for Hive Storage Handler implementation
+ */
+public class PhoenixStorageHandlerConstants {
+
+   public static final String HBASE_INPUT_FORMAT_CLASS = 
"phoenix.input.format.class";
+
+   public static final String PHOENIX_TABLE_NAME = "phoenix.table.name";
+
+   public static final String DEFAULT_PHOENIX_INPUT_CLASS = 
"com.lgcns.bigdata.platform.hive.hbase.phoenix.mapreduce.PhoenixResultWritable";
+
+   public static final String ZOOKEEPER_QUORUM = 
"phoenix.zookeeper.quorum";
+public static final String ZOOKEEPER_PORT = 
"phoenix.zookeeper.client.port";
+public static final String ZOOKEEPER_PARENT = 
"phoenix.zookeeper.znode.parent";
+public static final String DEFAULT_ZOOKEEPER_QUORUM = "localhost";
+public static final int DEFAULT_ZOOKEEPER_PORT = 2181;
+public static final String DEFAULT_ZOOKEEPER_PARENT = "/hbase";
+
+public static final String PHOENIX_ROWKEYS = "phoenix.rowkeys";
+public static final String PHOENIX_COLUMN_MAPPING = 
"phoenix.column.mapping";
+public static final String PHOENIX_TABLE_OPTIONS = 
"phoenix.table.options";
+
+public static final String PHOENIX_TABLE_QUERY_HINT = ".query.hint";
+public static final String PHOENIX_REDUCER_NUMBER = ".reducer.count";
+public static final String DISABLE_WAL = ".disable.wal";
+public static final String BATCH_MODE = "batch.mode";
+public static final String AUTO_FLUSH = ".auto.flush";
+
+public static final String COLON = ":";
+   public static final String COMMA = ",";
+   public static final String EMPTY_STRING = "";
+   public static final String SPACE = " ";
+   public static final String LEFT_ROUND_BRACKET = "(";
+   public static final String RIGHT_ROUND_BRACKET = ")";
+   public static final String QUOTATION_MARK = "'";
+   public static final String EQUAL = "=";
+   public static final String IS = "is";
+   public static final String QUESTION = "?";
+
+   public static final String SPLIT_BY_STATS = "split.by.stats";
+   public static final String HBASE_SCAN_CACHE = "hbase.scan.cache";
+   public static final String HBASE_SCAN_CACHEBLOCKS = 
"hbase.scan.cacheblock";
+   public static final String HBASE_DATE_FORMAT = "hbase.date.format";
+   public static final String HBASE_TIMESTAMP_FORMAT = 
"hbase.timestamp.format";
+   public static final String DEFAULT_DATE_FORMAT = "yyyy-MM-dd";
+   public static final String DEFAULT_TIMESTAMP_FORMAT = "yyyy-MM-dd HH:mm:ss.SSS";
+
+   public static final String IN_OUT_WORK = "in.out.work";
+   public static final String IN_WORK = "input";
+   public static final String OUT_WORK = "output";
+
+   public static final String MR = "mr";
+   public static final String TEZ = "tez";
+   public static final String SPARK = "spark";
+
+   public static final String DATE_TYPE = "date";
+   public static final String TIMESTAMP_TYPE = "timestamp";
+   public static final String BETWEEN_COMPARATOR = "between";
+   public static final String IN_COMPARATOR = "in";
+   public static final List<String> COMMON_COMPARATOR = Lists.newArrayList("=", "<", ">", "<=", ">=");
+
+   // Convert query conditions for date/timestamp type columns
--- End diff 
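
The date and timestamp format constants in the hunk above look like 
java.text.SimpleDateFormat patterns. As a rough sketch of how a pushed-down 
date condition might be rewritten, assuming the defaults are yyyy-MM-dd and 
yyyy-MM-dd HH:mm:ss.SSS and that the target is Phoenix's TO_DATE function: 
DateLiteralSketch and toDateExpression are hypothetical names, and the patch 
itself may do this differently.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Hypothetical helper; not part of the patch under review.
final class DateLiteralSketch {

    // Assumed default patterns (the archived text of the constants is damaged).
    static final String DEFAULT_DATE_FORMAT = "yyyy-MM-dd";
    static final String DEFAULT_TIMESTAMP_FORMAT = "yyyy-MM-dd HH:mm:ss.SSS";

    /**
     * Checks that the literal parses with the pattern, then wraps it in a TO_DATE call.
     * The literal is not escaped, so this sketch assumes it contains no quote characters.
     */
    static String toDateExpression(String literal, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // reject out-of-range fields such as month 13
        try {
            format.parse(literal);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a valid date literal: " + literal, e);
        }
        return "TO_DATE('" + literal + "', '" + pattern + "')";
    }

    public static void main(String[] args) {
        System.out.println(toDateExpression("2016-04-11", DEFAULT_DATE_FORMAT));
        System.out.println(toDateExpression("2016-04-11 10:20:30.123", DEFAULT_TIMESTAMP_FORMAT));
    }
}
```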

[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235851#comment-15235851
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59269847
  
--- Diff: 
phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java ---
@@ -0,0 +1,245 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive;
+
+import java.sql.Connection;
+import java.sql.SQLException;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.metastore.HiveMetaHook;
+import org.apache.hadoop.hive.metastore.TableType;
+import org.apache.hadoop.hive.metastore.api.FieldSchema;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.api.Table;
+import org.apache.phoenix.hive.constants.PhoenixStorageHandlerConstants;
+import org.apache.phoenix.hive.util.PhoenixConnectionUtil;
+import org.apache.phoenix.hive.util.PhoenixStorageHandlerUtil;
+import org.apache.phoenix.hive.util.PhoenixUtil;
+
+import com.google.common.base.CharMatcher;
+import com.google.common.base.Splitter;
+import com.google.common.collect.Lists;
+
+public class PhoenixMetaHook implements HiveMetaHook {
+
+private static final Log LOG = 
LogFactory.getLog(PhoenixMetaHook.class);
+
+@Override
+public void preCreateTable(Table table) throws MetaException {
+if (LOG.isDebugEnabled()) {
+LOG.debug("Precreate  table : " + table.getTableName());
+}
+
+try (Connection conn = PhoenixConnectionUtil.getConnection(table)) 
{
+String tableType = table.getTableType();
+String tableName = 
PhoenixStorageHandlerUtil.getTargetTableName(table);
+
+if (TableType.EXTERNAL_TABLE.name().equals(tableType)) {
+// Check whether phoenix table exists.
+if (!PhoenixUtil.existTable(conn, tableName)) {
+// Error if phoenix table not exist.
+throw new MetaException("Phoenix table " + tableName + 
" doesn't exist");
+}
+} else if (TableType.MANAGED_TABLE.name().equals(tableType)) {
+// Check whether phoenix table exists.
+if (PhoenixUtil.existTable(conn, tableName)) {
+// Error if phoenix table already exist.
+throw new MetaException("Phoenix table " + tableName + 
" already exist.");
+}
+
+PhoenixUtil.createTable(conn, createTableStatement(table));
+} else {
+throw new MetaException("Unsupported table Type: " + 
table.getTableType());
+}
+
+if (LOG.isDebugEnabled()) {
+LOG.debug("Phoenix table " + tableName + " was created");
+}
+} catch (SQLException e) {
+throw new MetaException(e.getMessage());
+}
+}
+
+private String createTableStatement(Table table) throws MetaException {
+Map<String, String> tableParameterMap = table.getParameters();
+
+String tableName = 
PhoenixStorageHandlerUtil.getTargetTableName(table);
+StringBuilder ddl = new StringBuilder("create table 
").append(tableName).append(" (\n");
+
+String phoenixRowKeys = 
tableParameterMap.get(PhoenixStorageHandlerConstants
+.PHOENIX_ROWKEYS);
+StringBuilder realRowKeys = new StringBuilder();
+List<String> phoenixRowKeyList = Lists.newArrayList(Splitter.on
+
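
The preCreateTable check in the hunk above comes down to a two-way rule: an 
external Hive table requires the Phoenix table to exist already, while a 
managed table requires that it does not exist yet and then creates it. A 
minimal, dependency-free sketch of that rule follows; PreCreateRule, Action 
and decide are hypothetical names used only for illustration.

```java
// Hypothetical, dependency-free illustration of the preCreateTable rule quoted above.
final class PreCreateRule {

    /** What the hook should do once table type and existence are known. */
    enum Action { USE_EXISTING, CREATE }

    /**
     * @param tableType     Hive table type name, e.g. "EXTERNAL_TABLE" or "MANAGED_TABLE"
     * @param phoenixExists whether the target Phoenix table already exists
     */
    static Action decide(String tableType, boolean phoenixExists) {
        switch (tableType) {
            case "EXTERNAL_TABLE":
                // External tables only map onto something that is already there.
                if (!phoenixExists) {
                    throw new IllegalStateException("Phoenix table doesn't exist");
                }
                return Action.USE_EXISTING;
            case "MANAGED_TABLE":
                // Managed tables are created by the hook, so a clash is an error.
                if (phoenixExists) {
                    throw new IllegalStateException("Phoenix table already exists");
                }
                return Action.CREATE;
            default:
                throw new IllegalStateException("Unsupported table type: " + tableType);
        }
    }

    public static void main(String[] args) {
        System.out.println(decide("EXTERNAL_TABLE", true));
        System.out.println(decide("MANAGED_TABLE", false));
    }
}
```

Keeping the rule in a pure function like this would also make it easy to unit 
test without a Phoenix connection.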

[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235859#comment-15235859
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59270355
  
--- Diff: 
phoenix-hive/src/main/java/org/apache/phoenix/hive/PhoenixRowKey.java ---
@@ -0,0 +1,69 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.hive;
+
+import com.google.common.collect.Maps;
+import org.apache.hadoop.hive.ql.io.RecordIdentifier;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.ObjectInputStream;
+import java.io.ObjectOutputStream;
+import java.io.OutputStream;
+import java.util.Map;
+
+public class PhoenixRowKey extends RecordIdentifier {
+
+   private Map<String, Object> rowKeyMap = Maps.newHashMap();
+
+   public PhoenixRowKey() {
+
+   }
+
+   //  public void add(String columnName, Object value) {
--- End diff --

Dead code




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235842#comment-15235842
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59269404
  
--- Diff: phoenix-hive/pom.xml ---
@@ -0,0 +1,224 @@
+
+
+
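+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.phoenix</groupId>
+    <artifactId>phoenix</artifactId>
+    <version>4.8.0-HBase-1.1-SNAPSHOT</version>
+  </parent>
+  <artifactId>phoenix-hive</artifactId>
+  <name>Phoenix - Hive</name>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+      <classifier>tests</classifier>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>joda-time</groupId>
+      <artifactId>joda-time</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-exec</artifactId>
+      <version>${hive.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.apache.calcite</groupId>
+          <artifactId>*</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-core</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-avatica</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-common</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-cli</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>jline</groupId>
+      <artifactId>jline</artifactId>
+      <version>0.9.94</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+      <version>${commons-lang.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-logging</groupId>
+      <artifactId>commons-logging</artifactId>
+      <version>${commons-logging.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-testing-util</artifactId>
+      <scope>test</scope>
+      <optional>true</optional>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-it</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-protocol</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-client</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-annotations</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-mapreduce-client-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-minicluster</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-all</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+  </dependencies>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.codehaus.mojo</groupId>
+        <artifactId>build-helper-maven-plugin</artifactId>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-failsafe-plugin</artifactId>
+      </plugin>
+      <plugin>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <version>${maven-dependency-plugin.version}</version>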
+
--- End diff --

Wonky indentation here. Tabs maybe?



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235841#comment-15235841
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59269345
  
--- Diff: phoenix-hive/pom.xml ---
@@ -0,0 +1,224 @@
+
+
+
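+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.phoenix</groupId>
+    <artifactId>phoenix</artifactId>
+    <version>4.8.0-HBase-1.1-SNAPSHOT</version>
+  </parent>
+  <artifactId>phoenix-hive</artifactId>
+  <name>Phoenix - Hive</name>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+      <classifier>tests</classifier>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>joda-time</groupId>
+      <artifactId>joda-time</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-exec</artifactId>
+      <version>${hive.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.apache.calcite</groupId>
+          <artifactId>*</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-core</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-avatica</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-common</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-cli</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>jline</groupId>
+      <artifactId>jline</artifactId>
+      <version>0.9.94</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+      <version>${commons-lang.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-logging</groupId>
+      <artifactId>commons-logging</artifactId>
+      <version>${commons-logging.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-testing-util</artifactId>
+      <scope>test</scope>
+      <optional>true</optional>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-it</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-protocol</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-client</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-annotations</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-mapreduce-client-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-minicluster</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-all</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
+  </dependencies>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.codehaus.mojo</groupId>
+        <artifactId>build-helper-maven-plugin</artifactId>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-failsafe-plugin</artifactId>
+      </plugin>
+      <plugin>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <version>${maven-dependency-plugin.version}</version>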
--- End diff --

Likely unnecessary version specification



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235840#comment-15235840
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59269299
  
--- Diff: phoenix-hive/pom.xml ---
@@ -0,0 +1,224 @@
+
+
+
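+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.phoenix</groupId>
+    <artifactId>phoenix</artifactId>
+    <version>4.8.0-HBase-1.1-SNAPSHOT</version>
+  </parent>
+  <artifactId>phoenix-hive</artifactId>
+  <name>Phoenix - Hive</name>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+      <classifier>tests</classifier>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>joda-time</groupId>
+      <artifactId>joda-time</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-exec</artifactId>
+      <version>${hive.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.apache.calcite</groupId>
+          <artifactId>*</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-core</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-avatica</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-common</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-cli</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>jline</groupId>
+      <artifactId>jline</artifactId>
+      <version>0.9.94</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+      <version>${commons-lang.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-logging</groupId>
+      <artifactId>commons-logging</artifactId>
+      <version>${commons-logging.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-testing-util</artifactId>
+      <scope>test</scope>
+      <optional>true</optional>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-it</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-protocol</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-client</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-annotations</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-mapreduce-client-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-minicluster</artifactId>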
--- End diff --

Should this be test scope?




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235839#comment-15235839
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

Github user joshelser commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/155#discussion_r59269276
  
--- Diff: phoenix-hive/pom.xml ---
@@ -0,0 +1,224 @@
+
+
+
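+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.phoenix</groupId>
+    <artifactId>phoenix</artifactId>
+    <version>4.8.0-HBase-1.1-SNAPSHOT</version>
+  </parent>
+  <artifactId>phoenix-hive</artifactId>
+  <name>Phoenix - Hive</name>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.phoenix</groupId>
+      <artifactId>phoenix-core</artifactId>
+      <classifier>tests</classifier>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>joda-time</groupId>
+      <artifactId>joda-time</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-exec</artifactId>
+      <version>${hive.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.apache.calcite</groupId>
+          <artifactId>*</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-core</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.calcite</groupId>
+      <artifactId>calcite-avatica</artifactId>
+      <version>${calcite.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-common</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hive</groupId>
+      <artifactId>hive-cli</artifactId>
+      <version>${hive.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>jline</groupId>
+      <artifactId>jline</artifactId>
+      <version>0.9.94</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-lang</groupId>
+      <artifactId>commons-lang</artifactId>
+      <version>${commons-lang.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-logging</groupId>
+      <artifactId>commons-logging</artifactId>
+      <version>${commons-logging.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-testing-util</artifactId>
+      <scope>test</scope>
+      <optional>true</optional>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-it</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jruby</groupId>
+          <artifactId>jruby-complete</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-protocol</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-client</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-hadoop2-compat</artifactId>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-common</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-annotations</artifactId>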
--- End diff --

I don't see this actively referenced in your changes.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-04-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231586#comment-15231586
 ] 

James Taylor commented on PHOENIX-2743:
---

Is this ready to go, [~sergey.soldatov]? How does the test coverage look? 
Should we dup out PHOENIX-331, or is that complementary? Mind taking a look at 
the PR, [~maghamravikiran]?



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213837#comment-15213837
 ] 

ASF GitHub Bot commented on PHOENIX-2743:
-

GitHub user ss77892 opened a pull request:

https://github.com/apache/phoenix/pull/155

PHOENIX-2743 Hive Storage support



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ss77892/phoenix PHOENIX-2743

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/phoenix/pull/155.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #155


commit 2cb80a71520fd896e5530ac77a2c3c4b17d6f8e2
Author: Sergey Soldatov 
Date:   2016-03-28T05:42:17Z

PHOENIX-2743 Hive Storage support






[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-15 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196616#comment-15196616
 ] 

James Taylor commented on PHOENIX-2743:
---

[~warwithin] - yes, it's a dup, but both have implementations against them. 
Hopefully they'll talk and we'll get the best of both worlds. :-)



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-15 Thread YoungWoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196603#comment-15196603
 ] 

YoungWoo Kim commented on PHOENIX-2743:
---

Dup of PHOENIX-331?



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-11 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15191062#comment-15191062
 ] 

James Taylor commented on PHOENIX-2743:
---

[~mini666] - with our 4.7 release, the validation check done by MutationState 
is only done once per call to commit. You can also prevent the RPC we do to 
check if the client has the current metadata through the UPDATE_CACHE_FREQUENCY 
property (see https://phoenix.apache.org/language/index.html#options). Please 
let me know if there's still an issue with the 4.7.0 release - otherwise let's 
come up with some other solution that avoids having to change MutationState.
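
For illustration only, here is a minimal Java sketch of setting UPDATE_CACHE_FREQUENCY on an existing table over JDBC. The class name, method names, and table name are hypothetical, the JDBC URL is a placeholder to be supplied by the caller, and the Phoenix client jar is assumed to be on the classpath when `apply` is called. The value is given in milliseconds, as described on the options page linked above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of Phoenix or of the patch under review.
class UpdateCacheFrequencySketch {

    // Builds the DDL that sets UPDATE_CACHE_FREQUENCY (in milliseconds) on a table.
    static String alterTableSql(String tableName, long frequencyMillis) {
        if (frequencyMillis < 0) {
            throw new IllegalArgumentException("frequency must be non-negative");
        }
        return "ALTER TABLE " + tableName + " SET UPDATE_CACHE_FREQUENCY=" + frequencyMillis;
    }

    // Executes the DDL. jdbcUrl is a placeholder such as "jdbc:phoenix:<zookeeper quorum>".
    static void apply(String jdbcUrl, String tableName, long frequencyMillis) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.execute(alterTableSql(tableName, frequencyMillis));
        }
    }
}
```

With a setting like this, the client would reuse its cached table metadata for the given interval instead of checking with the server on every statement, which is the RPC saving mentioned above.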



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-11 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190653#comment-15190653
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

It would be really nice since there are a lot of people who are still using 
older versions of Hive and who may be interested in this functionality.





[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-10 Thread JeongMin Ju (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190388#comment-15190388
 ] 

JeongMin Ju commented on PHOENIX-2743:
--

Oh, you're right. My mistake.
The LazyObjectInspectorParameters and LazySerDeParameters classes were 
introduced in Hive 1.2.
For now, only Hive 1.2 and above is supported.
Supporting earlier versions would require additional work, if that is needed.




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-10 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190322#comment-15190322
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

Hi JeongMin,

Yep, I took a look at the sources, tried to run it with an old version of Hive 
(1.0.0) and got a number of ClassNotFound exceptions.  
1. LazyObjectInspectorParameters, as well as the rest of the classes I 
mentioned, were introduced in Hive 1.2.0. How can it work with previous 
versions? That's exactly what I was asking about. Just try to compile only the 
Phoenix-related classes with an older Hive. 
2. Thanks. 
3. Oops. I just wanted to say that it would be nice to have comments that can 
be understood by others. 
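
As a rough sketch of how such a version dependency could be detected at runtime rather than through ClassNotFound failures, the Java below probes for classes by name. The helper class is hypothetical, and the fully qualified Hive class names are assumptions based on the Hive 1.2 package layout; they should be checked against the Hive version in use.

```java
// Hypothetical helper, not part of the handler under review.
class HiveClassProbe {

    // Assumed fully qualified names of two classes discussed in this thread.
    static final String[] HIVE_1_2_CLASSES = {
        "org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters",
        "org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyObjectInspectorParameters"
    };

    // Returns true if the named class can be located without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, HiveClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns true only if every class the handler is assumed to need is on the classpath.
    static boolean hasRequiredHiveClasses() {
        for (String name : HIVE_1_2_CLASSES) {
            if (!isPresent(name)) {
                return false;
            }
        }
        return true;
    }
}
```

A check like this only reports the problem earlier with a clearer message; it does not make the handler work on older Hive versions, which would still need code changes.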




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-10 Thread JeongMin Ju (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190273#comment-15190273
 ] 

JeongMin Ju commented on PHOENIX-2743:
--

Hello Sergey.
Did you examine and run what I posted on GitHub?
The supported features listed there are all accurate, and the handler is 
deployed in my company's system.

1. Yes, it can. I checked Hive 0.13 through 2.0 by reading the code only, not 
by running it. I found that it can be applied to versions from 0.14 to 2.0.
Hive 0.13 will require more changes. Even for Hive 0.14 to 2.0, only the 
changed code should be merged in, rather than overwriting whole files.
I do not understand the part of your question about avoiding the Explain level 
class. What does it mean? 
The LazyObjectInspectorParameters and LazySerDeParameters classes are used.

2. Thank you for your suggestion. I hope more people will use Phoenix with 
this feature.
I will open a Hive JIRA, following your advice.

3. Is that a joke? You clearly do not know Chinese. The comments are in 
Korean.
Do you mean you could not test because the comments are written in Korean?
OK, I am willing to change them to English.





[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-09 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188464#comment-15188464
 ] 

Sergey Soldatov commented on PHOENIX-2743:
--

Well, I have some concerns about it. 
1. Can it support Hive versions prior to 1.2? That is, is it possible to avoid 
using LazyObjectInspectorParameters, LazySerDeParameters, Explain level, 
ReflectionUtil and similar recently added classes?
2. It would be nice to have it as a module in Phoenix. That means not hiding 
Hive classes at all. If Hive is missing a feature, it should be contributed to 
Hive through a separate JIRA.
3. Logging, comments in Chinese, no tests
   




[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-06 Thread JeongMin Ju (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182590#comment-15182590
 ] 

JeongMin Ju commented on PHOENIX-2743:
--

I have finished publishing the source code and usage instructions to GitHub.

https://github.com/mini666/hive-phoenix-handler





[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-06 Thread JeongMin Ju (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182453#comment-15182453
 ] 

JeongMin Ju commented on PHOENIX-2743:
--

Thanks for the reply.
The change to MutationState disables the validation check for batch upserts.
The current code of MutationState.validate is very inefficient because it
checks every row.
Turning off the validation for batch upserts improves performance.
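
To illustrate the idea only, here is a minimal sketch of a mutation buffer with a switch that skips a per-row check during batch commits. Everything in it (BatchUpsertSketch, Row, the timestamp comparison) is a hypothetical stand-in and not Phoenix's actual MutationState code; it just shows that the number of checks grows with the batch size unless the check is turned off.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a client-side mutation buffer.
// This is NOT Phoenix's MutationState; names and logic are assumptions.
public class BatchUpsertSketch {

    /** Hypothetical row: a key plus the schema timestamp it was built against. */
    public static final class Row {
        final String key;
        final long schemaTimestamp;

        public Row(String key, long schemaTimestamp) {
            this.key = key;
            this.schemaTimestamp = schemaTimestamp;
        }
    }

    private static final long CURRENT_SCHEMA_TIMESTAMP = 1L;

    private final boolean validatePerRow;
    private final List<Row> committed = new ArrayList<>();
    private long validationCalls = 0;

    public BatchUpsertSketch(boolean validatePerRow) {
        this.validatePerRow = validatePerRow;
    }

    /** Commits a batch; runs the per-row check only when the switch is on. */
    public void commit(List<Row> batch) {
        if (validatePerRow) {
            for (Row row : batch) {
                validate(row);
            }
        }
        committed.addAll(batch);
    }

    // Assumed check: the row must match the current schema timestamp.
    private void validate(Row row) {
        validationCalls++;
        if (row.schemaTimestamp != CURRENT_SCHEMA_TIMESTAMP) {
            throw new IllegalStateException("stale row: " + row.key);
        }
    }

    public long validationCalls() {
        return validationCalls;
    }

    public int committedCount() {
        return committed.size();
    }
}
```

With the switch on, committing N rows performs N checks; with it off, the same rows are committed with no checks. The trade-off in this sketch is that a stale row is no longer rejected at commit time.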



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-04 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180439#comment-15180439
 ] 

Devaraj Das commented on PHOENIX-2743:
--

It is definitely related. FYI [~nmaillard]



[jira] [Commented] (PHOENIX-2743) HivePhoenixHandler for big-big join with predicate push down

2016-03-04 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180434#comment-15180434
 ] 

James Taylor commented on PHOENIX-2743:
---

[~mini666] - thank you so much for letting us know about this very interesting
work. You should collaborate with [~sergey.soldatov] on PHOENIX-331. How does
this differ? Is yours more about being able to use HiveQL?

What changes did you need to make to MutationState? It would be interesting to see
how our 4.7.0 release will impact your work, as we now support transactions.

FYI, [~devaraj], [~enis].
