[jira] [Commented] (HIVE-7199) Cannot alter table to parquet

2014-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026165#comment-14026165
 ] 

Hive QA commented on HIVE-7199:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12649433/HIVE-7199.patch

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5608 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/420/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/420/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-420/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12649433

> Cannot alter table to parquet
> -
>
> Key: HIVE-7199
> URL: https://issues.apache.org/jira/browse/HIVE-7199
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Vasanth kumar RJ
>Assignee: Vasanth kumar RJ
> Fix For: 0.14.0
>
> Attachments: HIVE-7199.patch
>
>
> Unable to alter a table to parquet:
> >alter table t1 set fileformat parquet;
> After this, the table can no longer be queried.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7158:
-

Attachment: HIVE-7158.4.patch

.4 sets the lower bound to Math.max(1, estimate * min_factor)
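A minimal sketch of that bound computation (illustrative only; `minFactor` and `maxFactor` stand in for the proposed scaling factors and are not Hive's actual names):

```java
public class ReducerBounds {
    // Clamp the lower bound to at least 1: a small estimate times a small
    // factor truncates to 0, and a 0 lower bound could let Tez skip the
    // reduce stage entirely.
    static int minReducers(int estimate, double minFactor) {
        return Math.max(1, (int) (estimate * minFactor));
    }

    static int maxReducers(int estimate, double maxFactor) {
        return Math.max(1, (int) (estimate * maxFactor));
    }

    public static void main(String[] args) {
        System.out.println(minReducers(100, 0.25)); // 25
        System.out.println(minReducers(2, 0.25));   // 1, not 0
        System.out.println(maxReducers(100, 2.0));  // 200
    }
}
```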

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch, 
> HIVE-7158.4.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7158:
-

Status: Patch Available  (was: Open)

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch, 
> HIVE-7158.4.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7158:
-

Status: Open  (was: Patch Available)

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026151#comment-14026151
 ] 

Gunther Hagleitner commented on HIVE-7158:
--

Spoke to [~gopalv] offline. 0 isn't going to work: apparently Tez could skip 
the reduce stage altogether if that occurs, which would break some queries.

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026139#comment-14026139
 ] 

Gunther Hagleitner commented on HIVE-7158:
--

Also, [~sseth]/[~gopalv], I've changed minReducers to be factor * estimate 
(removed the "+ 1" as suggested by Sid), which means you might get 0 as the 
lower bound. Is this going to be a problem? (And is this the same as TEZ-1163 
or not?)

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026134#comment-14026134
 ] 

Gunther Hagleitner commented on HIVE-7158:
--

[~sseth], can you elaborate on why overriding 
tez.am.shuffle-vertex-manager.min-src-fraction in Hive might be a good idea? 
I took care of the rest of the review comments.

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7158:
-

Status: Patch Available  (was: Open)

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7158:
-

Attachment: HIVE-7158.3.patch

.3 addresses review comments

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch, HIVE-7158.3.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7158) Use Tez auto-parallelism in Hive

2014-06-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7158:
-

Status: Open  (was: Patch Available)

> Use Tez auto-parallelism in Hive
> 
>
> Key: HIVE-7158
> URL: https://issues.apache.org/jira/browse/HIVE-7158
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7158.1.patch, HIVE-7158.2.patch
>
>
> Tez can optionally sample data from a fraction of the tasks of a vertex and 
> use that information to choose the number of downstream tasks for any given 
> scatter-gather edge.
> Hive estimates the reducer count by looking at stats and estimates for each 
> operator in the pipeline leading up to the reducer. However, if this estimate 
> turns out to be too large, Tez can rein in the resources used to compute the 
> reduce stage.
> It does so by combining partitions of the upstream vertex. It cannot, 
> however, add reducers at this stage.
> I'm proposing to let users specify whether they want to use auto-parallelism. 
> If they do, scaling factors will determine the max and min reducer counts Tez 
> can choose from. We will then partition by the max, letting Tez sample and 
> rein in the count down to the specified min.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-2365) SQL support for bulk load into HBase

2014-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026097#comment-14026097
 ] 

Hive QA commented on HIVE-2365:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12648785/HIVE-2365.3.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5536 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_char
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/419/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/419/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-419/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12648785

> SQL support for bulk load into HBase
> 
>
> Key: HIVE-2365
> URL: https://issues.apache.org/jira/browse/HIVE-2365
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: John Sichi
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-2365.2.patch.txt, HIVE-2365.3.patch, 
> HIVE-2365.3.patch, HIVE-2365.WIP.00.patch, HIVE-2365.WIP.01.patch, 
> HIVE-2365.WIP.01.patch
>
>
> Support the "as simple as this" SQL for bulk load from Hive into HBase.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6993) Update Hive for Tez VertexLocationHint and getAvailableResource API changes

2014-06-09 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-6993:
--

Attachment: HIVE-6993.2.patch

Rebase to trunk.

> Update Hive for Tez VertexLocationHint and getAvailableResource API changes
> ---
>
> Key: HIVE-6993
> URL: https://issues.apache.org/jira/browse/HIVE-6993
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.13.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-6993.1.patch, HIVE-6993.2.patch
>
>
> Tez had two breaking API changes between 0.4.x and 0.5.x:
> context.getTotalAvailableResource().getMemory();
> and
> context.scheduleVertexTasks(List<TaskWithLocationHint>);



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7204) Use NULL vertex location hint for Prewarm DAG vertices

2014-06-09 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7204:
--

Attachment: HIVE-7204.1.patch

> Use NULL vertex location hint for Prewarm DAG vertices
> --
>
> Key: HIVE-7204
> URL: https://issues.apache.org/jira/browse/HIVE-7204
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Attachments: HIVE-7204.1.patch
>
>
> The current 0.5.x branch of Tez added extra preconditions that check that 
> the number of vertex location hints matches the parallelism (number of 
> containers) set for the vertex.
> {code}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.lang.IllegalArgumentException): 
> Locations array length must match the parallelism set for the vertex
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
> at org.apache.tez.dag.api.Vertex.setTaskLocationsHint(Vertex.java:105)
> at 
> org.apache.tez.dag.app.DAGAppMaster.startPreWarmContainers(DAGAppMaster.java:1004)
> {code}
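The precondition in that stack trace is easy to sketch in isolation. The following is a simplified stand-in for the check in org.apache.tez.dag.api.Vertex (not the real implementation), showing why a null location hint avoids the failure:

```java
import java.util.Collections;
import java.util.List;

public class LocationHintCheck {
    // Simplified stand-in for Tez's task location hint type.
    static class TaskLocationHint {}

    // Sketch of the precondition: the hint list length must equal the
    // vertex parallelism. A null hint list skips the comparison, which is
    // what this patch proposes for prewarm vertices.
    static void setTaskLocationsHint(int parallelism, List<TaskLocationHint> hints) {
        if (hints != null && hints.size() != parallelism) {
            throw new IllegalArgumentException(
                "Locations array length must match the parallelism set for the vertex");
        }
    }

    public static void main(String[] args) {
        setTaskLocationsHint(4, null); // OK: no hints, no length check
        try {
            setTaskLocationsHint(4, Collections.singletonList(new TaskLocationHint()));
        } catch (IllegalArgumentException e) {
            System.out.println("precondition fired: " + e.getMessage());
        }
    }
}
```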



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7204) Use NULL vertex location hint for Prewarm DAG vertices

2014-06-09 Thread Gopal V (JIRA)
Gopal V created HIVE-7204:
-

 Summary: Use NULL vertex location hint for Prewarm DAG vertices
 Key: HIVE-7204
 URL: https://issues.apache.org/jira/browse/HIVE-7204
 Project: Hive
  Issue Type: Sub-task
  Components: Tez
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor


The current 0.5.x branch of Tez added extra preconditions that check that 
the number of vertex location hints matches the parallelism (number of 
containers) set for the vertex.

{code}
Caused by: 
org.apache.hadoop.ipc.RemoteException(java.lang.IllegalArgumentException): 
Locations array length must match the parallelism set for the vertex
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
at org.apache.tez.dag.api.Vertex.setTaskLocationsHint(Vertex.java:105)
at 
org.apache.tez.dag.app.DAGAppMaster.startPreWarmContainers(DAGAppMaster.java:1004)
{code}




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7140) Bump default hive.metastore.client.socket.timeout to 5 minutes

2014-06-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026082#comment-14026082
 ] 

Lefty Leverenz commented on HIVE-7140:
--

Good question, [~brocknoland].  We've used release notes for doc tasks 
previously, and that seems better than labels.  To make it work we should have 
a standard phrase which everyone remembers to use, but I'm not confident about 
compliance so I've been flagging email messages "to-doc-14" in gmail for a 
personal list.

Hm, TO-DOC-14 might be a good phrase for the release notes.

Another approach would be to create a Jira ticket where we can list release 
0.14's doc tasks in the comments.  That seems unnecessarily complicated, 
though.  Do you have any other suggestions?

I'll try to find the previous discussion on this topic.  [~thejas] was a 
participant, as I recall.

> Bump default hive.metastore.client.socket.timeout to 5 minutes
> --
>
> Key: HIVE-7140
> URL: https://issues.apache.org/jira/browse/HIVE-7140
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0, 0.12.0, 0.13.0
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.14.0
>
> Attachments: HIVE-7140.patch
>
>
> The issue is that OOTB clients often face timeouts when using HMS, since many 
> HMS operations are long-running (e.g. many operations on a table with many 
> partitions). A few supporting pieces of information:
> * The default value of hive.metastore.client.socket.timeout is 20 seconds.
> * Since the timeout is client-only, the server happily continues doing the 
> requested work.
> * Clients retry after a small delay to perform the requested work again, 
> often while the server is still trying to complete the original request.
> * A few tests have actually increased this value in order to pass reliably.
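For reference, the proposed default corresponds to a hive-site.xml entry like the following (a sketch; in this release the property takes seconds, and 300 is the 5-minute value the issue title proposes):

```xml
<property>
  <name>hive.metastore.client.socket.timeout</name>
  <!-- client-side Thrift socket timeout, raised from the 20-second default -->
  <value>300</value>
</property>
```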



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7079) Hive logs errors about missing tables when parsing CTE expressions

2014-06-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026079#comment-14026079
 ] 

Ashutosh Chauhan commented on HIVE-7079:


Can you create RB request for it?

> Hive logs errors about missing tables when parsing CTE expressions
> --
>
> Key: HIVE-7079
> URL: https://issues.apache.org/jira/browse/HIVE-7079
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.0
>Reporter: Craig Condit
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7079.1.patch.txt
>
>
> Given a query containing common table expressions (CTE) such as:
> WITH a AS (SELECT ...), b AS (SELECT ...)
> SELECT * FROM a JOIN b on a.col = b.col ...;
> Hive CLI executes the query, but logs stack traces at ERROR level during 
> query parsing:
> {noformat}
> ERROR metadata.Hive: NoSuchObjectException(message:ccondit.a table not found)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29338)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29306)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result.read(ThriftHiveMetastore.java:29237)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1036)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>   at com.sun.proxy.$Proxy7.getTable(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:967)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:909)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1223)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1192)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9209)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:391)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:291)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:944)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1009)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {noformat}
> It looks like Hive is attempting to resolve the CTE aliases as physical 
> tables.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7174) Do not accept string as scale and precision when reading Avro schema

2014-06-09 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-7174:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks Jarcec for the contribution.

> Do not accept string as scale and precision when reading Avro schema
> 
>
> Key: HIVE-7174
> URL: https://issues.apache.org/jira/browse/HIVE-7174
> Project: Hive
>  Issue Type: Bug
>Reporter: Jarek Jarcec Cecho
>Assignee: Jarek Jarcec Cecho
> Fix For: 0.14.0
>
> Attachments: HIVE-7174.patch, dec.avro
>
>
> I've noticed that the current AvroSerde will happily accept a schema that 
> uses strings instead of integers for scale and precision, e.g. the fragment 
> {{"precision":"4","scale":"1"}} from the following table:
> {code}
> CREATE TABLE `avro_dec1`(
>   `name` string COMMENT 'from deserializer',
>   `value` decimal(4,1) COMMENT 'from deserializer')
> COMMENT 'just drop the schema right into the HQL'
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> TBLPROPERTIES (
>   'numFiles'='1',
>   
> 'avro.schema.literal'='{\"namespace\":\"com.howdy\",\"name\":\"some_schema\",\"type\":\"record\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"value\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":\"4\",\"scale\":\"1\"}}]}'
> );
> {code}
> However, the decimal spec defined in AVRO-1402 requires these to be 
> integers, and hence allows only the fragment {{"precision":4,"scale":1}} 
> (i.e. no double quotes around the numbers).
> Since Hive can propagate this incorrect schema to new files, creating files 
> with an invalid schema, I think we should change the behavior and insist on 
> the correct schema.
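A rough illustration of the stricter validation being argued for (not the actual AvroSerde code): when the schema JSON is parsed, "precision":4 yields an integer while "precision":"4" yields a string, and only the former should be accepted:

```java
public class AvroDecimalCheck {
    // Illustrative: reject precision/scale values that parsed from the
    // schema JSON as strings rather than integers, per AVRO-1402.
    static int requireInt(String field, Object value) {
        if (!(value instanceof Integer)) {
            throw new IllegalArgumentException(
                field + " must be a JSON integer, got: " + value);
        }
        return (Integer) value;
    }

    public static void main(String[] args) {
        System.out.println(requireInt("precision", 4)); // accepted: 4
        try {
            requireInt("scale", "1"); // string form is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```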



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7111) Extend join transitivity PPD to non-column expressions

2014-06-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026070#comment-14026070
 ] 

Ashutosh Chauhan commented on HIVE-7111:


Can you create a RB request for it?

> Extend join transitivity PPD to non-column expressions
> --
>
> Key: HIVE-7111
> URL: https://issues.apache.org/jira/browse/HIVE-7111
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt
>
>
> Join transitivity in PPD only supports column expressions, but it's possible 
> to extend this to generic expressions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-7074) The reducer parallelism should be a prime number for better stride protection

2014-06-09 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V resolved HIVE-7074.
---

Resolution: Not a Problem

> The reducer parallelism should be a prime number for better stride protection
> -
>
> Key: HIVE-7074
> URL: https://issues.apache.org/jira/browse/HIVE-7074
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7074.1.patch
>
>
> The current Hive reducer parallelism results in stride issues with key 
> distribution.
> A JOIN generating even numbers will get strided onto only some of the 
> reducers.
> The probability of distribution skew is controlled by the number of common 
> factors shared by the hashcode of the key and the number of buckets.
> Using a prime number within the reducer estimation will cut that probability 
> down by a significant amount.
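
The stride effect can be reproduced directly: route keys to reducers by `hash % n`, feed in only even key hashes, and compare a composite reducer count against a prime one.

```python
def spread(keys, n_reducers):
    """Number of distinct reducers hit when keys route via hash % n_reducers."""
    return len({k % n_reducers for k in keys})

even_keys = range(0, 10000, 2)  # a JOIN emitting only even key hashes

# Composite count: even keys share the factor 2 with 64, so half the
# reducers never receive a row.
print(spread(even_keys, 64))  # → 32

# Prime count: gcd(2, 67) == 1, so the stride covers every reducer.
print(spread(even_keys, 67))  # → 67
```

This is the "common factors shared by the hashcode and the number of buckets" argument from the description: a prime bucket count shares no factor with any stride, so the skew disappears.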



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7121) Use murmur hash to distribute HiveKey

2014-06-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7121:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~gopalv]!

> Use murmur hash to distribute HiveKey
> -
>
> Key: HIVE-7121
> URL: https://issues.apache.org/jira/browse/HIVE-7121
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7121.1.patch, HIVE-7121.2.patch, HIVE-7121.3.patch, 
> HIVE-7121.WIP.patch
>
>
> The current hashCode implementation produces poor parallelism when dealing 
> with single integers or doubles.
> And for partitioned inserts into a 1-bucket table, there is a significant 
> hotspot on Reducer #31.
> Removing the magic number 31 and using a more normal hash algorithm would 
> help fix these hotspots.
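
The benefit of a proper mixing function can be sketched without Hive: identity-style hash codes (as Java produces for single integers) leave low-bit patterns intact, while a murmur-style finalizer spreads every input bit into every output bit. The `fmix32` below is the standard MurmurHash3 32-bit finalizer, used here only as an illustration of the kind of mixing the patch moves toward.

```python
def fmix32(h):
    """MurmurHash3 32-bit finalizer: avalanches all input bits."""
    h &= 0xFFFFFFFF
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h

keys = range(0, 1024, 4)  # identity hash codes sharing low-bit structure
buckets = 32

naive = {k % buckets for k in keys}          # identity hash: low bits decide
mixed = {fmix32(k) % buckets for k in keys}  # mixed hash: all bits decide

print(len(naive))  # → 8 of 32 buckets used
print(len(mixed))  # far more buckets used after mixing
```

With identity hashing, 256 keys land in only 8 of 32 buckets because the keys are all multiples of 4; after mixing, the same keys spread across essentially all buckets.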



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7121) Use murmur hash to distribute HiveKey

2014-06-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026048#comment-14026048
 ] 

Gunther Hagleitner commented on HIVE-7121:
--

Believe so. I've run all the failed tests locally and didn't see any new errors.

> Use murmur hash to distribute HiveKey
> -
>
> Key: HIVE-7121
> URL: https://issues.apache.org/jira/browse/HIVE-7121
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7121.1.patch, HIVE-7121.2.patch, HIVE-7121.3.patch, 
> HIVE-7121.WIP.patch
>
>
> The current hashCode implementation produces poor parallelism when dealing 
> with single integers or doubles.
> And for partitioned inserts into a 1-bucket table, there is a significant 
> hotspot on Reducer #31.
> Removing the magic number 31 and using a more normal hash algorithm would 
> help fix these hotspots.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7174) Do not accept string as scale and precision when reading Avro schema

2014-06-09 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026028#comment-14026028
 ] 

Jarek Jarcec Cecho commented on HIVE-7174:
--

Are you going to commit the changes [~xuefuz]?

> Do not accept string as scale and precision when reading Avro schema
> 
>
> Key: HIVE-7174
> URL: https://issues.apache.org/jira/browse/HIVE-7174
> Project: Hive
>  Issue Type: Bug
>Reporter: Jarek Jarcec Cecho
>Assignee: Jarek Jarcec Cecho
> Fix For: 0.14.0
>
> Attachments: HIVE-7174.patch, dec.avro
>
>
> I've noticed that the current AvroSerde will happily accept a schema that 
> uses strings instead of integers for scale and precision, e.g. the fragment 
> {{"precision":"4","scale":"1"}} in the following table:
> {code}
> CREATE TABLE `avro_dec1`(
>   `name` string COMMENT 'from deserializer',
>   `value` decimal(4,1) COMMENT 'from deserializer')
> COMMENT 'just drop the schema right into the HQL'
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> TBLPROPERTIES (
>   'numFiles'='1',
>   
> 'avro.schema.literal'='{\"namespace\":\"com.howdy\",\"name\":\"some_schema\",\"type\":\"record\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"value\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":\"4\",\"scale\":\"1\"}}]}'
> );
> {code}
> However, the Decimal spec defined in AVRO-1402 requires these values to be 
> integers, and hence allows only the fragment 
> {{"precision":4,"scale":1}} (i.e. no double quotes around the numbers).
> As Hive can propagate this incorrect schema to new files, thereby creating 
> files with an invalid schema, I think that we should alter the behavior and 
> insist on the correct schema.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7111) Extend join transitivity PPD to non-column expressions

2014-06-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7111:


Attachment: HIVE-7111.2.patch.txt

> Extend join transitivity PPD to non-column expressions
> --
>
> Key: HIVE-7111
> URL: https://issues.apache.org/jira/browse/HIVE-7111
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt
>
>
> Join transitivity in PPD only supports column expressions, but it's possible 
> to extend this to generic expressions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-3508) create cube and rollup operators in hive without mapside aggregation

2014-06-09 Thread Teruyoshi Zenmyo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teruyoshi Zenmyo updated HIVE-3508:
---

Status: Patch Available  (was: Open)

> create cube and rollup operators in hive without mapside aggregation
> 
>
> Key: HIVE-3508
> URL: https://issues.apache.org/jira/browse/HIVE-3508
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
> Attachments: HIVE-3508.2.patch.txt, HIVE-3508.patch.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 22372: HIVE-3508 : create cube and rollup operators in hive without mapside aggregation

2014-06-09 Thread Teruyoshi Zenmyo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22372/
---

(Updated June 10, 2014, 1:36 a.m.)


Review request for hive.


Changes
---

The patch is recreated because the previous one was not created correctly, and 
arguments in invocations of genGroupByPlanReduceSinkOperator() are fixed.


Bugs: HIVE-3508
https://issues.apache.org/jira/browse/HIVE-3508


Repository: hive-git


Description
---

This patch:
- adds a new operator (GroupMultiplexOperator)
- modifies SemanticAnalyzer to insert the operator
- adds two qtests


Diffs (updated)
-

  
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java
 aa094ee 
  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 8ae1c73 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupMultiplexOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 5d41fa1 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java 5fad971 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
2a8fb2b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 6cdaedb 
  ql/src/java/org/apache/hadoop/hive/ql/plan/GroupMultiplexDesc.java 
PRE-CREATION 
  
ql/src/test/queries/clientpositive/groupby_grouping_sets_skew_without_mapaggr.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/groupby_grouping_sets_without_mapaggr.q 
PRE-CREATION 
  
ql/src/test/results/clientpositive/groupby_grouping_sets_skew_without_mapaggr.q.out
 PRE-CREATION 
  
ql/src/test/results/clientpositive/groupby_grouping_sets_without_mapaggr.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/22372/diff/


Testing
---

tested in local mode.


Thanks,

Teruyoshi Zenmyo



[jira] [Updated] (HIVE-3508) create cube and rollup operators in hive without mapside aggregation

2014-06-09 Thread Teruyoshi Zenmyo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teruyoshi Zenmyo updated HIVE-3508:
---

Attachment: HIVE-3508.2.patch.txt

The patch is recreated because the previous one was not created correctly, and 
arguments in invocations of genGroupByPlanReduceSinkOperator() are fixed.

> create cube and rollup operators in hive without mapside aggregation
> 
>
> Key: HIVE-3508
> URL: https://issues.apache.org/jira/browse/HIVE-3508
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
> Attachments: HIVE-3508.2.patch.txt, HIVE-3508.patch.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-3508) create cube and rollup operators in hive without mapside aggregation

2014-06-09 Thread Teruyoshi Zenmyo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teruyoshi Zenmyo updated HIVE-3508:
---

Status: Open  (was: Patch Available)

> create cube and rollup operators in hive without mapside aggregation
> 
>
> Key: HIVE-3508
> URL: https://issues.apache.org/jira/browse/HIVE-3508
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
> Attachments: HIVE-3508.patch.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7079) Hive logs errors about missing tables when parsing CTE expressions

2014-06-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026001#comment-14026001
 ] 

Navis commented on HIVE-7079:
-

Hmm.. cannot reproduce union31 and hbase_joins.

> Hive logs errors about missing tables when parsing CTE expressions
> --
>
> Key: HIVE-7079
> URL: https://issues.apache.org/jira/browse/HIVE-7079
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.0
>Reporter: Craig Condit
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7079.1.patch.txt
>
>
> Given a query containing common table expressions (CTE) such as:
> WITH a AS (SELECT ...), b AS (SELECT ...)
> SELECT * FROM a JOIN b on a.col = b.col ...;
> Hive CLI executes the query, but logs stack traces at ERROR level during 
> query parsing:
> {noformat}
> ERROR metadata.Hive: NoSuchObjectException(message:ccondit.a table not found)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29338)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29306)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result.read(ThriftHiveMetastore.java:29237)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1036)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>   at com.sun.proxy.$Proxy7.getTable(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:967)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:909)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1223)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1192)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9209)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:391)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:291)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:944)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1009)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {noformat}
> It looks like Hive is attempting to resolve the CTE aliases as physical 
> tables.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7074) The reducer parallelism should be a prime number for better stride protection

2014-06-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025996#comment-14025996
 ] 

Gunther Hagleitner commented on HIVE-7074:
--

I believe that's unnecessary given HIVE-7121. Once we do a decent job of 
hashing the keys, the prime-number reducer requirement goes away.

> The reducer parallelism should be a prime number for better stride protection
> -
>
> Key: HIVE-7074
> URL: https://issues.apache.org/jira/browse/HIVE-7074
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7074.1.patch
>
>
> The current Hive reducer parallelism results in stride issues with key 
> distribution.
> A JOIN generating even numbers will get strided onto only some of the 
> reducers.
> The probability of distribution skew is controlled by the number of common 
> factors shared by the hashcode of the key and the number of buckets.
> Using a prime number within the reducer estimation will cut that probability 
> down by a significant amount.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7196) Configure session by single open session call

2014-06-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025995#comment-14025995
 ] 

Navis commented on HIVE-7196:
-

Cannot reproduce the failure of auto_join1.q. The others are well-known failures.

> Configure session by single open session call
> -
>
> Key: HIVE-7196
> URL: https://issues.apache.org/jira/browse/HIVE-7196
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-7196.1.patch.txt
>
>
> Currently, a jdbc2 connection executes a set command for each conf/var; 
> these could instead be embedded in TOpenSessionReq.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 22404: Optimize limit 0

2014-06-09 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22404/
---

Review request for hive.


Bugs: HIVE-7203
https://issues.apache.org/jira/browse/HIVE-7203


Repository: hive-git


Description
---

Optimize limit 0


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 68f1153 
  ql/src/test/queries/clientpositive/limit0.q PRE-CREATION 
  ql/src/test/results/clientpositive/limit0.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/22404/diff/


Testing
---

Added new test.


Thanks,

Ashutosh Chauhan



[jira] [Updated] (HIVE-7203) Optimize limit 0

2014-06-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7203:
---

Status: Patch Available  (was: Open)

> Optimize limit 0
> 
>
> Key: HIVE-7203
> URL: https://issues.apache.org/jira/browse/HIVE-7203
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-7203.patch
>
>
> Some tools generate queries with limit 0. Let's optimize that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7203) Optimize limit 0

2014-06-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7203:
---

Attachment: HIVE-7203.patch

> Optimize limit 0
> 
>
> Key: HIVE-7203
> URL: https://issues.apache.org/jira/browse/HIVE-7203
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-7203.patch
>
>
> Some tools generate queries with limit 0. Let's optimize that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7203) Optimize limit 0

2014-06-09 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-7203:
--

 Summary: Optimize limit 0
 Key: HIVE-7203
 URL: https://issues.apache.org/jira/browse/HIVE-7203
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Some tools generate queries with limit 0. Let's optimize that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7159) For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition

2014-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025949#comment-14025949
 ] 

Hive QA commented on HIVE-7159:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12647779/HIVE-7159.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/418/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/418/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-418/

Messages:
{noformat}
 This message was trimmed, see log for full details 
Decision can match input such as "LPAREN KW_CASE SmallintLiteral" using 
multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:115:5: 
Decision can match input such as "KW_CLUSTER KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:127:5: 
Decision can match input such as "KW_PARTITION KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:138:5: 
Decision can match input such as "KW_DISTRIBUTE KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:149:5: 
Decision can match input such as "KW_SORT KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:166:7: 
Decision can match input such as "STAR" using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:179:5: 
Decision can match input such as "KW_STRUCT" using multiple alternatives: 4, 6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:179:5: 
Decision can match input such as "KW_UNIONTYPE" using multiple alternatives: 5, 
6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:179:5: 
Decision can match input such as "KW_ARRAY" using multiple alternatives: 2, 6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_DATE StringLiteral" using multiple 
alternatives: 2, 3

As a result, alternative(s) 3 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_FALSE" using multiple alternatives: 3, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_TRUE" using multiple alternatives: 3, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_NULL" using multiple alternatives: 1, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_OVERWRITE" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_DISTRIBUTE 
KW_BY" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_MAP LPAREN" 
using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_INTO" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_LATERAL 
KW_VIEW" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_GROUP 
KW_BY" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "KW_BETWEEN KW_MAP LPAREN" using multiple 
alternatives: 8, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_ORDER 
KW_BY" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g

[jira] [Commented] (HIVE-7182) ResultSet is not closed in JDBCStatsPublisher#init()

2014-06-09 Thread steve, Oh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025947#comment-14025947
 ] 

steve, Oh commented on HIVE-7182:
-

executeUpdate wasn't moved outside the if. I moved stmt.close() and 
closeConnection() into the finally block.
{noformat}
 if (!tblExists) { // Table does not exist, create it
   String createTable = JDBCStatsUtils.getCreate("");
-  stmt.executeUpdate(createTable);
-  stmt.close();
-}
-closeConnection();
+  stmt.executeUpdate(createTable);  
+}  
{noformat}

> ResultSet is not closed in JDBCStatsPublisher#init()
> 
>
> Key: HIVE-7182
> URL: https://issues.apache.org/jira/browse/HIVE-7182
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Attachments: HIVE-7182.patch
>
>
> {code}
> ResultSet rs = dbm.getTables(null, null, 
> JDBCStatsUtils.getStatTableName(), null);
> boolean tblExists = rs.next();
> {code}
> rs is not closed upon return from init().
> If stmt.executeUpdate() throws an exception, stmt.close() would be skipped; 
> the close() call should be placed in a finally block.
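
The close-in-finally pattern the report asks for can be sketched generically. The `Resource` class and `init` function below are hypothetical stand-ins for the JDBC ResultSet/Statement/Connection objects, not the actual JDBCStatsPublisher code; the point is that the cleanup runs even when the body raises.

```python
class Resource:
    """Hypothetical stand-in for a JDBC ResultSet/Statement/Connection."""
    def __init__(self, name):
        self.name = name
        self.closed = False
    def close(self):
        self.closed = True

def init(rs, stmt, conn):
    try:
        # ... rs.next() / stmt.executeUpdate() would run here; either may raise ...
        raise RuntimeError("executeUpdate failed")
    finally:
        # close() runs on both the normal and the exception path, so no
        # resource leaks even when the body throws
        for r in (rs, stmt, conn):
            r.close()

rs, stmt, conn = Resource("rs"), Resource("stmt"), Resource("conn")
try:
    init(rs, stmt, conn)
except RuntimeError:
    pass
print(all(r.closed for r in (rs, stmt, conn)))  # → True
```

In Java the same effect is what a finally block (or try-with-resources) provides: the skipped `stmt.close()` in the bug report is exactly the case where the body throws before reaching the close call.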



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7182) ResultSet is not closed in JDBCStatsPublisher#init()

2014-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025942#comment-14025942
 ] 

Hive QA commented on HIVE-7182:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12648909/HIVE-7182.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/417/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/417/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-417/

Messages:
{noformat}
 This message was trimmed, see log for full details 
As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as "LPAREN KW_CASE KW_ARRAY" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as "LPAREN KW_CASE TinyintLiteral" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as "LPAREN KW_CASE KW_STRUCT" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as "LPAREN KW_CASE SmallintLiteral" using 
multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:115:5: 
Decision can match input such as "KW_CLUSTER KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:127:5: 
Decision can match input such as "KW_PARTITION KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:138:5: 
Decision can match input such as "KW_DISTRIBUTE KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:149:5: 
Decision can match input such as "KW_SORT KW_BY LPAREN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:166:7: 
Decision can match input such as "STAR" using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:179:5: 
Decision can match input such as "KW_STRUCT" using multiple alternatives: 4, 6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:179:5: 
Decision can match input such as "KW_UNIONTYPE" using multiple alternatives: 5, 
6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:179:5: 
Decision can match input such as "KW_ARRAY" using multiple alternatives: 2, 6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_DATE StringLiteral" using multiple 
alternatives: 2, 3

As a result, alternative(s) 3 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_FALSE" using multiple alternatives: 3, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_TRUE" using multiple alternatives: 3, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:261:5: 
Decision can match input such as "KW_NULL" using multiple alternatives: 1, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_OVERWRITE" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_DISTRIBUTE 
KW_BY" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_MAP LPAREN" 
using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_INTO" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:393:5: 
Decision can match input such as "{KW_LIKE, KW_REGEXP, KW_RLIKE} KW_LATERAL 
KW_VIEW" using multiple alternatives: 2, 9

As a result, alternative(s) 9 were 

[jira] [Commented] (HIVE-7196) Configure session by single open session call

2014-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025940#comment-14025940
 ] 

Hive QA commented on HIVE-7196:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12648933/HIVE-7196.1.patch.txt

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 5533 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimal
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalX
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalXY
org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/416/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/416/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-416/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12648933

> Configure session by single open session call
> -
>
> Key: HIVE-7196
> URL: https://issues.apache.org/jira/browse/HIVE-7196
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-7196.1.patch.txt
>
>
> Currently, jdbc2 connection executes set command for each conf/vars, which 
> can be embedded in TOpenSessionReq.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7202) DbTxnManager deadlocks in hcatalog.cli.TestSematicAnalysis.testAlterTblFFpart()

2014-06-09 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7202:


 Summary: DbTxnManager deadlocks in 
hcatalog.cli.TestSematicAnalysis.testAlterTblFFpart()
 Key: HIVE-7202
 URL: https://issues.apache.org/jira/browse/HIVE-7202
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Alan Gates


select * from HIVE_LOCKS produces

{noformat}
6   |1   |0   |default   |junit_sem_analysis   |NULL          |w|r|1402354627716   |NULL   |unknown   |ekoifman.local
6   |2   |0   |default   |junit_sem_analysis   |b=2010-10-10  |w|e|1402354627716   |NULL   |unknown   |ekoifman.local

2 rows selected
{noformat}

The easiest way to repro this is to add
hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, true);
hiveConf.setVar(HiveConf.ConfVars.HIVE_TXN_MANAGER, 
"org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");

to HCatBaseTest.setUpHiveConf()



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025896#comment-14025896
 ] 

Lefty Leverenz commented on HIVE-7065:
--

[~eugene.koifman], I don't know if there's a way to autogenerate wiki docs from 
xml files, but it would certainly be useful for Hive configs as well as WebHCat.

Confluence has this documentation with links to various possibilities:  
[Importing Content into Confluence -- Importing other non-wiki content | 
https://confluence.atlassian.com/display/DOC/Importing+Content+Into+Confluence#ImportingContentIntoConfluence-Importingothernon-wikicontent].

So we can follow those links and see if there's an easy way to autogenerate 
wikidocs, but I won't have much time for the research until I whittle down my 
backlog of 0.13 doc tasks.

In the meantime, to get a single source of truth I suggest adding a comparison 
of webhcat-default.xml vs. wikidoc to the release checklist.  Unlike Hive, 
WebHCat doesn't have a huge number of parameters so manually fixing the wiki 
for each release wouldn't be difficult.

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"
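For illustration, a webhcat-site.xml entry along these lines would pass the engine setting through. The property name templeton.hive.properties comes from the description above; the values are made-up examples, not recommended defaults:

```xml
<property>
  <name>templeton.hive.properties</name>
  <!-- Comma-separated list of Hive config properties handed to the Hive
       client on the node running the job. Values are illustrative only. -->
  <value>hive.metastore.uris=thrift://metastore-host:9083,hive.execution.engine=tez</value>
</property>
```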



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-6983) Remove test.warehouse.scheme parameter

2014-06-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan resolved HIVE-6983.


Resolution: Won't Fix

> Remove test.warehouse.scheme parameter
> --
>
> Key: HIVE-6983
> URL: https://issues.apache.org/jira/browse/HIVE-6983
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-6983.patch
>
>
> There is a parameter "test.warehouse.scheme" that does not seem to achieve 
> much in the tests that use it, but causes issues in some environments like 
> the windows environment when trying to test for path equality. We also seem 
> to explicitly null it out for hive itests because it has previously been set.
> We should remove it as something that's not useful anymore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6983) Remove test.warehouse.scheme parameter

2014-06-09 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025891#comment-14025891
 ] 

Sushanth Sowmyan commented on HIVE-6983:


Canceling altogether as WONTFIX; it looks like we do need this parameter for 
some tests, at least for now. It then becomes the responsibility of the test 
cases to check directories for equality instead.

> Remove test.warehouse.scheme parameter
> --
>
> Key: HIVE-6983
> URL: https://issues.apache.org/jira/browse/HIVE-6983
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-6983.patch
>
>
> There is a parameter "test.warehouse.scheme" that does not seem to achieve 
> much in the tests that use it, but causes issues in some environments like 
> the windows environment when trying to test for path equality. We also seem 
> to explicitly null it out for hive itests because it has previously been set.
> We should remove it as something that's not useful anymore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7085) TestOrcHCatPigStorer.testWriteDecimal tests are failing on trunk

2014-06-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025867#comment-14025867
 ] 

Ashutosh Chauhan commented on HIVE-7085:


Committed to trunk. Thanks, Navis!

> TestOrcHCatPigStorer.testWriteDecimal tests are failing on trunk
> 
>
> Key: HIVE-7085
> URL: https://issues.apache.org/jira/browse/HIVE-7085
> Project: Hive
>  Issue Type: Test
>  Components: HCatalog
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Navis
> Fix For: 0.14.0
>
> Attachments: HIVE-7085.1.patch.txt
>
>
> TestOrcHCatPigStorer.testWriteDecimal, 
> TestOrcHCatPigStorer.testWriteDecimalX, 
> TestOrcHCatPigStorer.testWriteDecimalXY
> are failing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7085) TestOrcHCatPigStorer.testWriteDecimal tests are failing on trunk

2014-06-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7085:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
 Assignee: Navis
   Status: Resolved  (was: Patch Available)

> TestOrcHCatPigStorer.testWriteDecimal tests are failing on trunk
> 
>
> Key: HIVE-7085
> URL: https://issues.apache.org/jira/browse/HIVE-7085
> Project: Hive
>  Issue Type: Test
>  Components: HCatalog
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Navis
> Fix For: 0.14.0
>
> Attachments: HIVE-7085.1.patch.txt
>
>
> TestOrcHCatPigStorer.testWriteDecimal, 
> TestOrcHCatPigStorer.testWriteDecimalX, 
> TestOrcHCatPigStorer.testWriteDecimalXY
> are failing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7201) Fix TestHiveConf#testConfProperties test case

2014-06-09 Thread Pankit Thapar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankit Thapar updated HIVE-7201:


Status: Patch Available  (was: Open)

> Fix TestHiveConf#testConfProperties test case
> -
>
> Key: HIVE-7201
> URL: https://issues.apache.org/jira/browse/HIVE-7201
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.13.0
>Reporter: Pankit Thapar
>Priority: Minor
> Attachments: HIVE-7201.patch
>
>
> CHANGE 1: 
> TEST CASE :
> The intention of TestHiveConf#testConfProperties() is to test the HiveConf 
> properties being set in the priority as expected.
> Each HiveConf object is initialized as follows:
> 1) Hadoop configuration properties are applied.
> 2) ConfVar properties with non-null values are overlayed.
> 3) hive-site.xml properties are overlayed.
> ISSUE :
> The mapreduce related configurations are loaded by JobConf and not 
> Configuration.
> The current test tries to get the configuration properties  like : 
> HADOOPNUMREDUCERS ("mapred.job.reduces")
> from Configuration class. But these mapreduce related properties are loaded 
> by JobConf class from mapred-default.xml.
> DETAILS :
> LINE  63 : checkHadoopConf(ConfVars.HADOOPNUMREDUCERS.varname, "1"); -->fails
> Because, 
> private void  checkHadoopConf(String name, String expectedHadoopVal) {
>  Assert.assertEquals(expectedHadoopVal, new Configuration().get(name)); 
> > Second parameter is null, since it's the JobConf class and not the 
> Configuration class that initializes mapred-default values. 
> }
> Code that loads mapreduce resources is in ConfigUtil and JobConf makes a call 
> like this (in static block):
> public class JobConf extends Configuration {
>   
>   private static final Log LOG = LogFactory.getLog(JobConf.class);
>   static{
> ConfigUtil.loadResources(); --> loads mapreduce related resources 
> (mapred-default.xml)
>   }
> .
> }
> Please note, the test case assertion works fine if HiveConf() constructor is 
> called before this assertion since, HiveConf() triggers JobConf()
> which basically sets the default values of the properties pertaining to 
> mapreduce.
> This is why, there won't be any failures if testHiveSitePath() was run before 
> testConfProperties() as that would load mapreduce
> properties into config properties.
> FIX:
> Instead of using a Configuration object, we can use the JobConf object to get 
> the default values used by hadoop/mapreduce.
> CHANGE 2:
> In TestHiveConf#testHiveSitePath(), a call to static method 
> getHiveSiteLocation() should be called statically instead of using an object.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7201) Fix TestHiveConf#testConfProperties test case

2014-06-09 Thread Pankit Thapar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankit Thapar updated HIVE-7201:


Attachment: HIVE-7201.patch

> Fix TestHiveConf#testConfProperties test case
> -
>
> Key: HIVE-7201
> URL: https://issues.apache.org/jira/browse/HIVE-7201
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.13.0
>Reporter: Pankit Thapar
>Priority: Minor
> Attachments: HIVE-7201.patch
>
>
> CHANGE 1: 
> TEST CASE :
> The intention of TestHiveConf#testConfProperties() is to test the HiveConf 
> properties being set in the priority as expected.
> Each HiveConf object is initialized as follows:
> 1) Hadoop configuration properties are applied.
> 2) ConfVar properties with non-null values are overlayed.
> 3) hive-site.xml properties are overlayed.
> ISSUE :
> The mapreduce related configurations are loaded by JobConf and not 
> Configuration.
> The current test tries to get the configuration properties  like : 
> HADOOPNUMREDUCERS ("mapred.job.reduces")
> from Configuration class. But these mapreduce related properties are loaded 
> by JobConf class from mapred-default.xml.
> DETAILS :
> LINE  63 : checkHadoopConf(ConfVars.HADOOPNUMREDUCERS.varname, "1"); -->fails
> Because, 
> private void  checkHadoopConf(String name, String expectedHadoopVal) {
>  Assert.assertEquals(expectedHadoopVal, new Configuration().get(name)); 
> > Second parameter is null, since it's the JobConf class and not the 
> Configuration class that initializes mapred-default values. 
> }
> Code that loads mapreduce resources is in ConfigUtil and JobConf makes a call 
> like this (in static block):
> public class JobConf extends Configuration {
>   
>   private static final Log LOG = LogFactory.getLog(JobConf.class);
>   static{
> ConfigUtil.loadResources(); --> loads mapreduce related resources 
> (mapred-default.xml)
>   }
> .
> }
> Please note, the test case assertion works fine if HiveConf() constructor is 
> called before this assertion since, HiveConf() triggers JobConf()
> which basically sets the default values of the properties pertaining to 
> mapreduce.
> This is why, there won't be any failures if testHiveSitePath() was run before 
> testConfProperties() as that would load mapreduce
> properties into config properties.
> FIX:
> Instead of using a Configuration object, we can use the JobConf object to get 
> the default values used by hadoop/mapreduce.
> CHANGE 2:
> In TestHiveConf#testHiveSitePath(), a call to static method 
> getHiveSiteLocation() should be called statically instead of using an object.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7201) Fix TestHiveConf#testConfProperties test case

2014-06-09 Thread Pankit Thapar (JIRA)
Pankit Thapar created HIVE-7201:
---

 Summary: Fix TestHiveConf#testConfProperties test case
 Key: HIVE-7201
 URL: https://issues.apache.org/jira/browse/HIVE-7201
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Pankit Thapar
Priority: Minor


CHANGE 1: 

TEST CASE :
The intention of TestHiveConf#testConfProperties() is to test that HiveConf 
properties are set with the expected priority.

Each HiveConf object is initialized as follows:
1) Hadoop configuration properties are applied.
2) ConfVar properties with non-null values are overlayed.
3) hive-site.xml properties are overlayed.

ISSUE :
The mapreduce-related configurations are loaded by JobConf, not by 
Configuration.
The current test tries to get configuration properties like 
HADOOPNUMREDUCERS ("mapred.job.reduces")
from the Configuration class, but these mapreduce-related properties are 
loaded by the JobConf class from mapred-default.xml.

DETAILS :
LINE  63 : checkHadoopConf(ConfVars.HADOOPNUMREDUCERS.varname, "1"); --> fails
because:
private void checkHadoopConf(String name, String expectedHadoopVal) {
  Assert.assertEquals(expectedHadoopVal, new Configuration().get(name));
  // The second parameter is null, since it's the JobConf class, not the
  // Configuration class, that initializes the mapred-default values.
}

The code that loads the mapreduce resources lives in ConfigUtil, and JobConf 
calls it like this (in a static block):
public class JobConf extends Configuration {

  private static final Log LOG = LogFactory.getLog(JobConf.class);

  static {
    ConfigUtil.loadResources(); // loads mapreduce resources (mapred-default.xml)
  }
  ...
}

Please note that the assertion works fine if the HiveConf() constructor is 
called before it, since HiveConf() triggers JobConf(), which sets the default 
values of the properties pertaining to mapreduce.
This is why there won't be any failures if testHiveSitePath() is run before 
testConfProperties(): that would load the mapreduce properties into the config 
properties.

FIX:
Instead of using a Configuration object, we can use a JobConf object to get 
the default values used by hadoop/mapreduce.

CHANGE 2:
In TestHiveConf#testHiveSitePath(), the static method getHiveSiteLocation() 
should be called statically instead of through an object.
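The static-initializer ordering at the heart of this report can be sketched without any Hadoop dependency. The class names below are illustrative stand-ins, not Hadoop's (a shared defaults map plays the role of mapred-default.xml, and the subclass's static block plays the role of ConfigUtil.loadResources()):

```java
import java.util.HashMap;
import java.util.Map;

// Base class: lookups see only whatever has been loaded so far.
class Config {
    static final Map<String, String> DEFAULTS = new HashMap<>();
    String get(String name) { return DEFAULTS.get(name); }
}

// Subclass whose static initializer loads the defaults, like JobConf does.
class JobConfig extends Config {
    static {
        DEFAULTS.put("mapred.job.reduces", "1");
    }
}

public class OrderDemo {
    public static void main(String[] args) {
        // Before JobConfig is touched, the base Config sees no default:
        System.out.println(new Config().get("mapred.job.reduces")); // null
        new JobConfig(); // first use triggers the static block
        System.out.println(new Config().get("mapred.job.reduces")); // 1
    }
}
```

This is exactly why the real test passes or fails depending on whether another test has already constructed a HiveConf (and hence a JobConf) first.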



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7072) HCatLoader only loads first region of hbase table

2014-06-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7072:
---

Status: Patch Available  (was: Open)

> HCatLoader only loads first region of hbase table
> -
>
> Key: HIVE-7072
> URL: https://issues.apache.org/jira/browse/HIVE-7072
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-7072.2.patch, HIVE-7072.3.patch
>
>
> Pig needs a config parameter 'pig.noSplitCombination' set to 'true' for it to 
> be able to read HBaseStorageHandler-based tables.
> This is done in the HBaseLoader at getSplits time, but HCatLoader does not do 
> so, which results in only a partial data load.
> Thus, we need one more special case definition in HCat, that sets this 
> parameter in the job properties if we detect that we're loading a 
> HBaseStorageHandler based table. (Note, also, that we should not depend 
> directly on the HBaseStorageHandler class, and instead depend on the name of 
> the class, since we do not want a mvn dependency on hive-hbase-handler to be 
> able to compile HCatalog core, since it's conceivable that at some time, 
> there might be a reverse dependency.) The primary issue is one of where this 
> code should go, since it doesn't belong in pig (pig does not know what loader 
> behaviour should be, and this parameter is its interface to a loader), and 
> doesn't belong in the HBaseStorageHandler either, since that's implementing a 
> HiveStorageHandler and is connecting up the two. Thus, this should belong to 
> HCatLoader. Setting this parameter across the board results in poor 
> performance for HCatLoader, so it must only be set when used with HBase.
> Thus, it belongs in the SpecialCases definition as that was created 
> specifically for these kinds of odd cases, and can be called from within 
> HCatLoader.
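The name-based special-casing described above could look roughly like this. The class and method names here are illustrative (the description mentions a SpecialCases definition but not its exact API); only the handler class name string and the pig.noSplitCombination property come from the discussion:

```java
import java.util.Properties;

// Sketch: match the storage handler by class *name* so HCatalog core needs
// no compile-time dependency on hive-hbase-handler.
public class SpecialCasesSketch {
    static final String HBASE_HANDLER =
        "org.apache.hadoop.hive.hbase.HBaseStorageHandler";

    static void addSpecialCasesForLoader(Properties jobProps, String handlerClass) {
        if (HBASE_HANDLER.equals(handlerClass)) {
            // Pig must not combine splits for HBase-backed tables; otherwise
            // only the first region is read.
            jobProps.setProperty("pig.noSplitCombination", "true");
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        addSpecialCasesForLoader(props, HBASE_HANDLER);
        System.out.println(props.getProperty("pig.noSplitCombination")); // true
    }
}
```

Because the check fires only for the HBase handler name, other tables keep split combination and avoid the performance penalty mentioned above.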



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7072) HCatLoader only loads first region of hbase table

2014-06-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7072:
---

Attachment: HIVE-7072.3.patch

Updated patch.

> HCatLoader only loads first region of hbase table
> -
>
> Key: HIVE-7072
> URL: https://issues.apache.org/jira/browse/HIVE-7072
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-7072.2.patch, HIVE-7072.3.patch
>
>
> Pig needs a config parameter 'pig.noSplitCombination' set to 'true' for it to 
> be able to read HBaseStorageHandler-based tables.
> This is done in the HBaseLoader at getSplits time, but HCatLoader does not do 
> so, which results in only a partial data load.
> Thus, we need one more special case definition in HCat, that sets this 
> parameter in the job properties if we detect that we're loading a 
> HBaseStorageHandler based table. (Note, also, that we should not depend 
> directly on the HBaseStorageHandler class, and instead depend on the name of 
> the class, since we do not want a mvn dependency on hive-hbase-handler to be 
> able to compile HCatalog core, since it's conceivable that at some time, 
> there might be a reverse dependency.) The primary issue is one of where this 
> code should go, since it doesn't belong in pig (pig does not know what loader 
> behaviour should be, and this parameter is its interface to a loader), and 
> doesn't belong in the HBaseStorageHandler either, since that's implementing a 
> HiveStorageHandler and is connecting up the two. Thus, this should belong to 
> HCatLoader. Setting this parameter across the board results in poor 
> performance for HCatLoader, so it must only be set when used with HBase.
> Thus, it belongs in the SpecialCases definition as that was created 
> specifically for these kinds of odd cases, and can be called from within 
> HCatLoader.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7175) Provide password file option to beeline

2014-06-09 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025819#comment-14025819
 ] 

Brock Noland commented on HIVE-7175:


Also, could you add a nice error message when an exception is thrown in obtainPasswordFromFile?

> Provide password file option to beeline
> ---
>
> Key: HIVE-7175
> URL: https://issues.apache.org/jira/browse/HIVE-7175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Clients
>Affects Versions: 0.13.0
>Reporter: Robert Justice
>Assignee: Dr. Wendell Urth
>  Labels: features, security
> Attachments: HIVE-7175.patch
>
>
> For people connecting to Hive Server 2 with LDAP authentication enabled, in 
> order to batch-run commands, we currently have to provide the password openly 
> on the command line. They could use some expect scripting, but I think a 
> valid improvement would be to provide a password-file option similar to other 
> CLI commands in hadoop (e.g. sqoop) to be more secure.
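A hedged sketch of what an obtainPasswordFromFile helper might look like, including the friendlier error message requested in the comment above. The method name appears in the review discussion, but its signature and the exact error wording here are assumptions, not the patch's actual code:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PasswordFileSketch {

    static String obtainPasswordFromFile(String passwordFilePath) {
        try {
            byte[] bytes = Files.readAllBytes(Paths.get(passwordFilePath));
            // Trim the trailing newline that editors commonly append.
            return new String(bytes, StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            // The "nice message" requested above: say which file failed and why.
            throw new RuntimeException("Unable to read password from file \""
                    + passwordFilePath + "\": " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("beeline-pw", ".txt");
        Files.write(tmp, "s3cret\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(obtainPasswordFromFile(tmp.toString())); // s3cret
        Files.delete(tmp);
    }
}
```

Keeping the password in a permission-restricted file keeps it out of the shell history and process listing, which is the point of the proposal.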



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7175) Provide password file option to beeline

2014-06-09 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7175:
---

Assignee: Dr. Wendell Urth

> Provide password file option to beeline
> ---
>
> Key: HIVE-7175
> URL: https://issues.apache.org/jira/browse/HIVE-7175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Clients
>Affects Versions: 0.13.0
>Reporter: Robert Justice
>Assignee: Dr. Wendell Urth
>  Labels: features, security
> Attachments: HIVE-7175.patch
>
>
> For people connecting to Hive Server 2 with LDAP authentication enabled, in 
> order to batch run commands, we currently have to provide the password openly 
> in the command line.   They could use some expect scripting, but I think a 
> valid improvement would be to provide a password file option similar to other 
> CLI commands in hadoop (e.g. sqoop) to be more secure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


TestMiniTezCliDriver is not always running

2014-06-09 Thread Brock Noland
+ Dev for awareness

Thank you for your investigation, Szehon. It also looks like when 
TestMiniTezCliDriver is not run, it times out. Logs are here:

http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-408/failed/TestMiniTezCliDriver/

The only thing of note is the exception below.


2014-06-07 21:07:30,489 DEBUG rpc.DAGClientRPCImpl
(DAGClientRPCImpl.java:resetProxy(144)) - Resetting AM proxy for app:
application_1402200067997_0012 dag:dag_1402200067997_12_01 due to exception :
org.apache.tez.dag.api.TezException: com.google.protobuf.ServiceException:
org.apache.hadoop.ipc.RemoteException(org.apache.tez.dag.api.TezException):
No running dag at present
at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAG(DAGAppMaster.java:1035)
at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAGStatus(DAGAppMaster.java:1013)
at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:79)
at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:8286)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

at
org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getDAGStatusViaAM(DAGClientRPCImpl.java:170)
at
org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getDAGStatus(DAGClientRPCImpl.java:83)
at
org.apache.tez.mapreduce.client.YARNRunner.getJobStatus(YARNRunner.java:673)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
at org.apache.hadoop.mapreduce.Job.getJobState(Job.java:347)
at
org.apache.hadoop.mapred.JobClient$NetworkedJob.getJobState(JobClient.java:295)
at
org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:243)
at
org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:546)
at
org.apache.hadoop.hive.ql.io.rcfile.merge.BlockMergeTask.execute(BlockMergeTask.java:216)
at
org.apache.hadoop.hive.ql.exec.DDLTask.mergeFiles(DDLTask.java:520)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:467)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:159)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1507)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1273)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1091)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:904)
at
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:272)
at
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:224)
at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:434)
at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at
org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:920)
at
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:644)
at
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_create_merge_compressed(TestMiniTezCliDriver.java:368)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

  at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run

[jira] [Commented] (HIVE-7110) TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile

2014-06-09 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025778#comment-14025778
 ] 

Szehon Ho commented on HIVE-7110:
-

Yea, that is the right plugin; it creates the /tmp folder under /target before 
the test, which contains the right hive-site.xml. I see it propagating fine for 
/hcatalog/core, so I'm not sure why you are seeing the error. Does it run when 
you run tests in other projects?
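For context, Hadoop resolves a URI scheme to a FileSystem implementation via the fs.&lt;scheme&gt;.impl configuration key, so "No FileSystem for scheme: pfile" means that mapping never reached the test JVM. In Hive's test hive-site.xml the entry looks roughly like this (quoted from memory, so treat the class name as an assumption to verify against the generated config):

```xml
<property>
  <name>fs.pfile.impl</name>
  <value>org.apache.hadoop.fs.ProxyLocalFileSystem</value>
  <description>Maps the pfile:// scheme used by Hive tests to a proxy over the local filesystem.</description>
</property>
```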

> TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile
> -
>
> Key: HIVE-7110
> URL: https://issues.apache.org/jira/browse/HIVE-7110
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: David Chen
>Assignee: David Chen
> Attachments: HIVE-7110.1.patch, HIVE-7110.2.patch, HIVE-7110.3.patch, 
> HIVE-7110.4.patch
>
>
> I got the following TestHCatPartitionPublish test failure when running all 
> unit tests against Hadoop 1. This also appears when testing against Hadoop 2.
> {code}
>  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.06 sec 
> <<< FAILURE! - in org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish
> testPartitionPublish(org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish)
>   Time elapsed: 1.361 sec  <<< ERROR!
> org.apache.hive.hcatalog.common.HCatException: 
> org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output 
> information. Cause : java.io.IOException: No FileSystem for scheme: pfile
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1443)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:212)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
> at 
> org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.runMRCreateFail(TestHCatPartitionPublish.java:191)
> at 
> org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:155)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7115) Support a mechanism for running hive locally that doesnt require having a hadoop executable.

2014-06-09 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025767#comment-14025767
 ] 

Edward Capriolo commented on HIVE-7115:
---

That would be really nice, especially if it could be extended to dependent 
projects. 
https://github.com/edwardcapriolo/hive_test requires lots of trickery to launch 
a hive process.

> Support a mechanism for running hive locally that doesnt require having a 
> hadoop executable.
> 
>
> Key: HIVE-7115
> URL: https://issues.apache.org/jira/browse/HIVE-7115
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure, Tests
>Reporter: jay vyas
>
> Mapreduce has a "local" mode by default, and likewise, tools such as pig and 
> SOLR do as well; maybe we can have a first-class local mode for hive 
> also. 
> For local integration testing of a hadoop app, it would be nice if we could 
> fire up a local hive instance which didn't require "bin/hadoop" for running 
> local jobs.  This would allow us to maintain polyglot hadoop applications 
> much more easily by incorporating hive into the integration tests.  For example:
> {noformat}
> LocalHiveInstance hive = new LocalHiveInstance();
> hive.set("course", "crochet");
> hive.runScript("hive_flow.ql");
> {noformat} 
> This would essentially run a local hive query which mirrors
> {noformat}
> hive -f hive_flow.ql -hiveconf course=crochet
> {noformat} 
> It seems like there might be a simple way to do this, at least for small data 
> sets, by putting some kind of alternative (i.e. in-memory) execution 
> environment under hive, if one is not already underway?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 22329: HIVE-7190. WebHCat launcher task failure can cause two concurent user jobs to run

2014-06-09 Thread Ivan Mitic


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > 1. I think webhcat-default.xml should be modified to include the jars that 
> > are now required in templeton.libjars to minimize out-of-the-box config for 
> > end users.
> > 2. Is there any test (e2e) that can be added for this? (with reasonable 
> > amount of effort)
> > 3. When you tested that Pig/Hive jobs get properly tagged, you mean you 
> > tested that MR jobs that are generated by Pig/Hive are tagged, correct?
> 
> Eugene Koifman wrote:
> 4. Actually, instead of doing 1, could WebHCat dynamically figure out 
> which hadoop version it's talking to and add only the necessary shim jar, 
> rather than shipping all of them?  It reduces the amount of config needed.  
> It would also be better if we can only ship the minimal set of jars.
>

1. I like your proposal from #4. I actually started down this route but ran into 
some issues when I tried to add libjars programmatically. Let me try harder and 
I'll reply back. 
2. Will have to check out what we have currently.
3. Correct, I validated that MR jobs generated by Pig/Hive are tagged properly. 


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java,
> >  line 44
> > 
> >
> > I think it would be useful to add a more detailed description of these 
> > props.  Something like what is in the JIRA ticket.  I would have added the 
> > ticket number to the comment, but Hive prohibits that.

Will fix this, thanks


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/LaunchMapper.java,
> >  line 126
> > 
> >
> > Which user will this use?  Is it the user running WebHCat or the value 
> > of 'doAs' parameter?

This is running in the context of the task itself. In non-secure hadoop this is 
in the same context as the nodemanager/tasktracker. In secure hadoop I believe 
this is in the context of the user submitting the job.


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java, 
> > line 157
> > 
> >
> > Is LOG.info() the right log level?  Seems like it will pollute the log 
> > file.

I think this is totally fine, it's just a single entry in the task syslog. This 
is super useful info (IMO a must-have) for users to understand what the 
templeton launcher job does.


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java, 
> > line 160
> > 
> >
> > Is LOG.info() the right level?

I think this is ok.


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java, 
> > line 189
> > 
> >
> > log level

Same as above, I think this is ok. 


- Ivan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22329/#review44992
---


On June 6, 2014, 10:02 p.m., Ivan Mitic wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22329/
> ---
> 
> (Updated June 6, 2014, 10:02 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Approach in the patch is similar to what Oozie does to handle this situation. 
> Specifically, all child map jobs get tagged with the launcher MR job id. On 
> launcher task restart, launcher queries RM for the list of jobs that have the 
> tag and kills them. After that it moves on to start the same child job again. 
> Again, similarly to what Oozie does, a new templeton.job.launch.time property 
> is introduced that captures the launcher job submit timestamp and is later 
> used to reduce the search window when the RM is queried. 
> 
> To validate the patch, you will need to add webhcat shim jars to 
> templeton.libjars as now webhcat launcher also has a dependency on hadoop 
> shims. 
> 
> I have noticed that in the case of the SqoopDelegator, webhcat currently does 
> not set the MR delegation token when the optionsFile flag is used. This also 
> creates a problem in this scenario. It looks like something that should be 
> handled via a separate Jira.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveDelegator.java
>  23b1c4f 
>   
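The recovery approach described in the review (tag child jobs with the launcher job id; on launcher task restart, query the RM for still-running jobs carrying that tag and kill them before resubmitting) can be sketched as follows. This is a minimal simulation against a hypothetical in-memory resource manager, not the actual YARN client API:

```python
import time

class FakeRM:
    """Hypothetical in-memory stand-in for the ResourceManager."""
    def __init__(self):
        self.jobs = []  # each: {"id", "tag", "submitted", "state"}

    def submit(self, job_id, tag):
        self.jobs.append({"id": job_id, "tag": tag,
                          "submitted": time.time(), "state": "RUNNING"})

    def find_by_tag(self, tag, started_after):
        # The launch-time bound narrows the search window, playing the role
        # of templeton.job.launch.time in the patch.
        return [j for j in self.jobs
                if j["tag"] == tag and j["submitted"] >= started_after
                and j["state"] == "RUNNING"]

    def kill(self, job):
        job["state"] = "KILLED"

def launcher_attempt(rm, launcher_id, launch_time):
    # On (re)start, kill any still-running children left by a prior attempt...
    for stale in rm.find_by_tag(launcher_id, launch_time):
        rm.kill(stale)
    # ...then submit the child job again, tagged with the launcher id.
    rm.submit("child_of_" + launcher_id, tag=launcher_id)

rm = FakeRM()
t0 = time.time()
launcher_attempt(rm, "job_001", t0)  # first attempt: child left RUNNING
launcher_attempt(rm, "job_001", t0)  # retry: old child killed, new one started
print(sum(j["state"] == "RUNNING" for j in rm.jobs))  # → 1
```

The point of the simulation is the invariant: no matter how many times the launcher restarts, at most one tagged child is running afterwards.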

[jira] [Commented] (HIVE-7110) TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile

2014-06-09 Thread David Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025752#comment-14025752
 ] 

David Chen commented on HIVE-7110:
--

[~szehon] Is the plugin you mentioned the {{process-resources}} and 
{{process-test-resources}} executions of the {{maven-antrun-plugin}} under the 
root pom.xml? Were the properties set by this plugin supposed to propagate to 
the other Maven modules/subprojects?

> TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile
> -
>
> Key: HIVE-7110
> URL: https://issues.apache.org/jira/browse/HIVE-7110
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: David Chen
>Assignee: David Chen
> Attachments: HIVE-7110.1.patch, HIVE-7110.2.patch, HIVE-7110.3.patch, 
> HIVE-7110.4.patch
>
>
> I got the following TestHCatPartitionPublish test failure when running all 
> unit tests against Hadoop 1. This also appears when testing against Hadoop 2.
> {code}
>  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.06 sec 
> <<< FAILURE! - in org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish
> testPartitionPublish(org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish)
>   Time elapsed: 1.361 sec  <<< ERROR!
> org.apache.hive.hcatalog.common.HCatException: 
> org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output 
> information. Cause : java.io.IOException: No FileSystem for scheme: pfile
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1443)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:212)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
> at 
> org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.runMRCreateFail(TestHCatPartitionPublish.java:191)
> at 
> org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:155)
> {code}





[jira] [Commented] (HIVE-6683) Beeline does not accept comments at end of line

2014-06-09 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025751#comment-14025751
 ] 

Nicolas Thiébaud commented on HIVE-6683:


I am also affected by this. It would be nice if it were fixed server-side so 
clients do not have to implement comment parsing. It would also make reusing 
queries across beeline / hue / jdbc much easier.
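Until then, the kind of comment stripping a client would need can be sketched roughly as below. This is a hypothetical helper, not part of beeline; it only handles end-of-line `--` comments and ignores `--` inside single- or double-quoted strings:

```python
def strip_line_comment(line):
    """Remove a trailing '-- ...' comment, ignoring '--' inside quotes."""
    in_single = in_double = False
    for i, c in enumerate(line):
        if c == "'" and not in_double:
            in_single = not in_single
        elif c == '"' and not in_single:
            in_double = not in_double
        elif (c == '-' and not in_single and not in_double
                and line[i:i + 2] == '--'):
            # Everything from '--' to end of line is a comment.
            return line[:i].rstrip()
    return line

query = "\n".join(strip_line_comment(l) for l in
                  ["SELECT", "1 -- this is a comment about this value",
                   "FROM", "t;"])
print(query)  # the comment on the second line is dropped
```

Applied to the failing query from the report, the comment is removed before the statement reaches the server, which is exactly the preprocessing the CLI does and beeline does not.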

> Beeline does not accept comments at end of line
> ---
>
> Key: HIVE-6683
> URL: https://issues.apache.org/jira/browse/HIVE-6683
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
>Reporter: Jeremy Beard
>
> Beeline fails to read queries where lines have comments at the end. This 
> works in the embedded Hive CLI.
> Example:
> SELECT
> 1 -- this is a comment about this value
> FROM
> table;
> Error: Error while processing statement: FAILED: ParseException line 1:36 
> mismatched input '' expecting FROM near '1' in from clause 
> (state=42000,code=4)





[jira] [Updated] (HIVE-7177) percentile_approx very inaccurate with high multiplicities in the data

2014-06-09 Thread Tom Temple (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Temple updated HIVE-7177:
-

Environment: Redhat 5.10 running Cloudera 5.0.1  (was: Redhat 5.10 running 
Cloudera 5.0.0)

> percentile_approx very inaccurate with high multiplicities in the data
> --
>
> Key: HIVE-7177
> URL: https://issues.apache.org/jira/browse/HIVE-7177
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.12.0
> Environment: Redhat 5.10 running Cloudera 5.0.1
>Reporter: Tom Temple
>
> To reproduce:
> 1) create a table with a single integer column
> 2) with values: 1 million, 2 million, 3 million, and 4 million each repeated 
> a quarter million times.
> 3) percentile_approx(cast(col_0 as double), array(0.33,0.34),100)
> Expected results: [200.0,200.0]
> Actual results: [128.0,132.0] (I might be off by 4 here)





[jira] [Commented] (HIVE-7115) Support a mechanism for running hive locally that doesn't require having a hadoop executable.

2014-06-09 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025745#comment-14025745
 ] 

Sushanth Sowmyan commented on HIVE-7115:


Oh, and one more thing: there is a MiniHS2 implementation in the tests: 
hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java

I would recommend experimenting with this more than the other approaches, 
because it's cleaner from an integration standpoint. You can then use jdbc or 
beeline to connect to it.

> Support a mechanism for running hive locally that doesn't require having a 
> hadoop executable.
> 
>
> Key: HIVE-7115
> URL: https://issues.apache.org/jira/browse/HIVE-7115
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure, Tests
>Reporter: jay vyas
>
> Mapreduce has a "local" mode by default, and likewise, tools such as pig and 
> SOLR do as well; maybe we can have a first-class local mode for hive 
> too. 
> For local integration testing of a hadoop app, it would be nice if we could 
> fire up a local hive instance which didn't require "bin/hadoop" for running 
> local jobs.  This would make it much easier to maintain polyglot hadoop 
> applications by incorporating hive into the integration tests.  For example:
> {noformat}
> LocalHiveInstance hive = new LocalHiveInstance();
> hive.set("course","crochet");
> hive.runScript("hive_flow.ql");
> {noformat} 
> Would essentially run a local hive query which mirrors
> {noformat}
> hive -f hive_flow.ql -hiveconf course=crochet
> {noformat} 
> It seems like there might be a simple way to do this, at least for small data 
> sets, by putting some kind of alternative (i.e. in-memory) execution 
> environment under hive, if one is not already underway.





[jira] [Commented] (HIVE-7115) Support a mechanism for running hive locally that doesn't require having a hadoop executable.

2014-06-09 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025735#comment-14025735
 ] 

Sushanth Sowmyan commented on HIVE-7115:


Hi Jay,

There are two things I can think of that might help development of such a tool:

a) One possible place to look is the test framework 
section. For example, look at our TestCli framework over at 
ql/src/test/templates/TestCliDriver.vm and 
./ql/src/test/templates/TestNegativeCliDriver.vm, which manage our .q and .q.out 
tests.
b) The other route I'd suggest is directly launching a Hive Driver, as done 
in tests like 
hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestE2EScenarios.java

> Support a mechanism for running hive locally that doesn't require having a 
> hadoop executable.
> 
>
> Key: HIVE-7115
> URL: https://issues.apache.org/jira/browse/HIVE-7115
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure, Tests
>Reporter: jay vyas
>
> Mapreduce has a "local" mode by default, and likewise, tools such as pig and 
> SOLR do as well; maybe we can have a first-class local mode for hive 
> too. 
> For local integration testing of a hadoop app, it would be nice if we could 
> fire up a local hive instance which didn't require "bin/hadoop" for running 
> local jobs.  This would make it much easier to maintain polyglot hadoop 
> applications by incorporating hive into the integration tests.  For example:
> {noformat}
> LocalHiveInstance hive = new LocalHiveInstance();
> hive.set("course","crochet");
> hive.runScript("hive_flow.ql");
> {noformat} 
> Would essentially run a local hive query which mirrors
> {noformat}
> hive -f hive_flow.ql -hiveconf course=crochet
> {noformat} 
> It seems like there might be a simple way to do this, at least for small data 
> sets, by putting some kind of alternative (i.e. in-memory) execution 
> environment under hive, if one is not already underway.





[jira] [Updated] (HIVE-7192) Hive Streaming - Some required settings are not mentioned in the documentation

2014-06-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7192:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

> Hive Streaming - Some required settings are not mentioned in the documentation
> --
>
> Key: HIVE-7192
> URL: https://issues.apache.org/jira/browse/HIVE-7192
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: Streaming
> Fix For: 0.14.0
>
> Attachments: HIVE-7192.patch
>
>
> Specifically:
>  - hive.support.concurrency on metastore
>  - hive.vectorized.execution.enabled for query client





[jira] [Commented] (HIVE-7192) Hive Streaming - Some required settings are not mentioned in the documentation

2014-06-09 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025714#comment-14025714
 ] 

Sushanth Sowmyan commented on HIVE-7192:


+1, doc fix, the test failures are unrelated, change looks okay to me. Will go 
ahead and commit.

> Hive Streaming - Some required settings are not mentioned in the documentation
> --
>
> Key: HIVE-7192
> URL: https://issues.apache.org/jira/browse/HIVE-7192
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: Streaming
> Attachments: HIVE-7192.patch
>
>
> Specifically:
>  - hive.support.concurrency on metastore
>  - hive.vectorized.execution.enabled for query client





[jira] [Updated] (HIVE-7155) WebHCat controller job exceeds container memory limit

2014-06-09 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-7155:


   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks for the contribution Shanyu!


> WebHCat controller job exceeds container memory limit
> -
>
> Key: HIVE-7155
> URL: https://issues.apache.org/jira/browse/HIVE-7155
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: shanyu zhao
>Assignee: shanyu zhao
> Fix For: 0.14.0
>
> Attachments: HIVE-7155.1.patch, HIVE-7155.2.patch, HIVE-7155.patch
>
>
> Submitting a Hive query on a large table via WebHCat results in failure because 
> the WebHCat controller job is killed by Yarn since it exceeds the memory 
> limit (set by mapreduce.map.memory.mb, defaults to 1GB):
> {code}
>  INSERT OVERWRITE TABLE Temp_InjusticeEvents_2014_03_01_00_00 SELECT * from 
> Stage_InjusticeEvents where LogTimestamp > '2014-03-01 00:00:00' and 
> LogTimestamp <= '2014-03-01 01:00:00';
> {code}
> We could increase mapreduce.map.memory.mb to solve this problem, but that way 
> we would be changing the setting system-wide.
> We need to provide a WebHCat configuration to overwrite 
> mapreduce.map.memory.mb when submitting the controller job.





[jira] [Updated] (HIVE-7075) JsonSerde raises NullPointerException when object key is not lower case

2014-06-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7075:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

> JsonSerde raises NullPointerException when object key is not lower case
> ---
>
> Key: HIVE-7075
> URL: https://issues.apache.org/jira/browse/HIVE-7075
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.12.0
>Reporter: Yibing Shi
>Assignee: Navis
> Fix For: 0.14.0
>
> Attachments: HIVE-7075.1.patch.txt, HIVE-7075.2.patch.txt, 
> HIVE-7075.3.patch.txt
>
>
> We have noticed that the JsonSerde produces a NullPointerException if a JSON 
> object has a key that is not lower case. For example, assume we have 
> the file "one.json": 
> { "empId" : 123, "name" : "John" } 
> { "empId" : 456, "name" : "Jane" } 
> hive> CREATE TABLE emps (empId INT, name STRING) 
> ROW FORMAT SERDE "org.apache.hive.hcatalog.data.JsonSerDe"; 
> hive> LOAD DATA LOCAL INPATH 'one.json' INTO TABLE emps; 
> hive> SELECT * FROM emps; 
> Failed with exception java.io.IOException:java.lang.NullPointerException 
>  
> Note that it seems to work if the keys are lower case. Assume we have the file 
> 'two.json': 
> { "empid" : 123, "name" : "John" } 
> { "empid" : 456, "name" : "Jane" } 
> hive> DROP TABLE emps; 
> hive> CREATE TABLE emps (empId INT, name STRING) 
> ROW FORMAT SERDE "org.apache.hive.hcatalog.data.JsonSerDe"; 
> hive> LOAD DATA LOCAL INPATH 'two.json' INTO TABLE emps;
> hive> SELECT * FROM emps; 
> OK 
> 123   John 
> 456   Jane
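The underlying mismatch is that Hive stores column names lower-cased while JSON keys keep their original case. A hedged illustration of the case-insensitive lookup that avoids the miss (a standalone sketch, not the actual JsonSerDe code):

```python
import json

def row_from_json(line, columns):
    """Map one JSON object to schema columns, matching keys case-insensitively."""
    obj = json.loads(line)
    lowered = {k.lower(): v for k, v in obj.items()}
    # 'empId' in the DDL is stored by Hive as 'empid'; matching on the
    # lower-cased JSON key makes "empId" in the data resolve anyway.
    return [lowered.get(col.lower()) for col in columns]

print(row_from_json('{ "empId" : 123, "name" : "John" }', ["empId", "name"]))
# → [123, 'John']
```

With a strict case-sensitive lookup the "empId" key would resolve to nothing for the "empid" column, which is the kind of null that surfaces as the NullPointerException above.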





[jira] [Commented] (HIVE-7075) JsonSerde raises NullPointerException when object key is not lower case

2014-06-09 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025690#comment-14025690
 ] 

Sushanth Sowmyan commented on HIVE-7075:


Looks good to me. +1.

Will go ahead and commit.

> JsonSerde raises NullPointerException when object key is not lower case
> ---
>
> Key: HIVE-7075
> URL: https://issues.apache.org/jira/browse/HIVE-7075
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.12.0
>Reporter: Yibing Shi
>Assignee: Navis
> Attachments: HIVE-7075.1.patch.txt, HIVE-7075.2.patch.txt, 
> HIVE-7075.3.patch.txt
>
>
> We have noticed that the JsonSerde produces a NullPointerException if a JSON 
> object has a key that is not lower case. For example, assume we have 
> the file "one.json": 
> { "empId" : 123, "name" : "John" } 
> { "empId" : 456, "name" : "Jane" } 
> hive> CREATE TABLE emps (empId INT, name STRING) 
> ROW FORMAT SERDE "org.apache.hive.hcatalog.data.JsonSerDe"; 
> hive> LOAD DATA LOCAL INPATH 'one.json' INTO TABLE emps; 
> hive> SELECT * FROM emps; 
> Failed with exception java.io.IOException:java.lang.NullPointerException 
>  
> Note that it seems to work if the keys are lower case. Assume we have the file 
> 'two.json': 
> { "empid" : 123, "name" : "John" } 
> { "empid" : 456, "name" : "Jane" } 
> hive> DROP TABLE emps; 
> hive> CREATE TABLE emps (empId INT, name STRING) 
> ROW FORMAT SERDE "org.apache.hive.hcatalog.data.JsonSerDe"; 
> hive> LOAD DATA LOCAL INPATH 'two.json' INTO TABLE emps;
> hive> SELECT * FROM emps; 
> OK 
> 123   John 
> 456   Jane





[jira] [Commented] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set

2014-06-09 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025678#comment-14025678
 ] 

Naveen Gangam commented on HIVE-7200:
-

Code review published to Review Boards @
https://reviews.apache.org/r/22396/


> Beeline output displays column heading even if --showHeader=false is set
> 
>
> Key: HIVE-7200
> URL: https://issues.apache.org/jira/browse/HIVE-7200
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7200.1.patch
>
>
> A few minor/cosmetic issues with the beeline CLI.
> 1) The tool prints the column headers despite --showHeader being set to false. 
> This property only seems to affect the subsequent header information that 
> gets printed based on the value of the property "headerInterval" (default value 
> is 100).
> 2) When "showHeader" is true & "headerInterval > 0", the header after the 
> first interval gets printed after  rows. The code seems 
> to count the initial header as a row, if you will.
> 3) The table footer (the line that closes the table) does not get printed if 
> "showHeader" is false. I think the table should get closed irrespective 
> of whether it prints the header or not.
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> +--+
> 6 rows selected (3.998 seconds)
> 0: jdbc:hive2://localhost:1> !set headerInterval 2
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> +--+
> | val  |
> +--+
> | f|
> | T|
> +--+
> | val  |
> +--+
> | F|
> | 0|
> +--+
> | val  |
> +--+
> | 1|
> +--+
> 6 rows selected (0.691 seconds)
> 0: jdbc:hive2://localhost:1> !set showHeader false
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> 6 rows selected (1.728 seconds)





[jira] [Updated] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set

2014-06-09 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-7200:


Fix Version/s: 0.14.0
   Status: Patch Available  (was: Open)

> Beeline output displays column heading even if --showHeader=false is set
> 
>
> Key: HIVE-7200
> URL: https://issues.apache.org/jira/browse/HIVE-7200
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7200.1.patch
>
>
> A few minor/cosmetic issues with the beeline CLI.
> 1) The tool prints the column headers despite --showHeader being set to false. 
> This property only seems to affect the subsequent header information that 
> gets printed based on the value of the property "headerInterval" (default value 
> is 100).
> 2) When "showHeader" is true & "headerInterval > 0", the header after the 
> first interval gets printed after  rows. The code seems 
> to count the initial header as a row, if you will.
> 3) The table footer (the line that closes the table) does not get printed if 
> "showHeader" is false. I think the table should get closed irrespective 
> of whether it prints the header or not.
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> +--+
> 6 rows selected (3.998 seconds)
> 0: jdbc:hive2://localhost:1> !set headerInterval 2
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> +--+
> | val  |
> +--+
> | f|
> | T|
> +--+
> | val  |
> +--+
> | F|
> | 0|
> +--+
> | val  |
> +--+
> | 1|
> +--+
> 6 rows selected (0.691 seconds)
> 0: jdbc:hive2://localhost:1> !set showHeader false
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> 6 rows selected (1.728 seconds)





[jira] [Updated] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set

2014-06-09 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-7200:


Attachment: HIVE-7200.1.patch

0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val  |
+--+
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
6 rows selected (2.052 seconds)
0: jdbc:hive2://localhost:1> !set showHeader false
0: jdbc:hive2://localhost:1> select * from stringvals;
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
6 rows selected (2.329 seconds)
0: jdbc:hive2://localhost:1> !set headerInterval 2
0: jdbc:hive2://localhost:1> select * from stringvals;
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
6 rows selected (1.482 seconds)
0: jdbc:hive2://localhost:1> !set showHeader true 
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val  |
+--+
| t|
| f|
+--+
| val  |
+--+
| T|
| F|
+--+
| val  |
+--+
| 0|
| 1|
+--+
6 rows selected (0.997 seconds)
0: jdbc:hive2://localhost:1> !set headerInterval 5
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val  |
+--+
| t|
| f|
| T|
| F|
| 0|
+--+
| val  |
+--+
| 1|
+--+
6 rows selected (0.822 seconds)
0: jdbc:hive2://localhost:1> !set headerInterval 50   
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val  |
+--+
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
6 rows selected (0.764 seconds)

The test results with the patch applied are shown above.

> Beeline output displays column heading even if --showHeader=false is set
> 
>
> Key: HIVE-7200
> URL: https://issues.apache.org/jira/browse/HIVE-7200
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-7200.1.patch
>
>
> A few minor/cosmetic issues with the beeline CLI.
> 1) The tool prints the column headers despite --showHeader being set to false. 
> This property only seems to affect the subsequent header information that 
> gets printed based on the value of the property "headerInterval" (default value 
> is 100).
> 2) When "showHeader" is true & "headerInterval > 0", the header after the 
> first interval gets printed after  rows. The code seems 
> to count the initial header as a row, if you will.
> 3) The table footer (the line that closes the table) does not get printed if 
> "showHeader" is false. I think the table should get closed irrespective 
> of whether it prints the header or not.
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> +--+
> 6 rows selected (3.998 seconds)
> 0: jdbc:hive2://localhost:1> !set headerInterval 2
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> +--+
> | val  |
> +--+
> | f|
> | T|
> +--+
> | val  |
> +--+
> | F|
> | 0|
> +--+
> | val  |
> +--+
> | 1|
> +--+
> 6 rows selected (0.691 seconds)
> 0: jdbc:hive2://localhost:1> !set showHeader false
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> 6 rows selected (1.728 seconds)





[jira] [Commented] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set

2014-06-09 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025653#comment-14025653
 ] 

Naveen Gangam commented on HIVE-7200:
-

A few cosmetic issues here

1) 
http://sqlline.sourceforge.net/#setting_showheader
http://sqlline.sourceforge.net/#setting_headerinterval

Although these properties are loosely defined in the SQLLine documentation 
above, it makes sense to assume showHeader is for all headers, not just 
subsequent ones.

2) headerInterval causes the header to be printed 1 row sooner the first time 
ONLY. The code suggests that it is including the header information as the 
first row, which is wrong, semantically speaking. From then on, the header is 
printed at the set "headerInterval" number of rows after.

3) The line that closes the table ("--" in this case) at the bottom, is 
also dependent on whether or not --showHeader is set to true. I believe that is 
incorrect too.
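The off-by-one in item 2 can be illustrated with a small simulation of the counting logic described above (a hypothetical sketch, not the actual sqlline code): when the initial header is counted as a row, the first repeated header fires one data row early, and every later one is shifted by the same amount.

```python
def header_positions(num_rows, interval, count_header_as_row):
    """Return the 1-based data-row indices after which a header is reprinted."""
    positions = []
    count = 1 if count_header_as_row else 0  # buggy variant counts the header
    for row in range(1, num_rows + 1):
        count += 1
        if count % interval == 0:
            positions.append(row)
    return positions

# With interval=2 over 6 rows, counting the header reprints it after rows
# 1, 3, 5 (one data row early the first time, then at the set interval),
# while the intended behavior reprints it after rows 2, 4, 6.
print(header_positions(6, 2, count_header_as_row=True))   # → [1, 3, 5]
print(header_positions(6, 2, count_header_as_row=False))  # → [2, 4, 6]
```

The first (buggy) sequence matches the `headerInterval 2` transcript quoted below the comment: a header after one row, then after every two.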

> Beeline output displays column heading even if --showHeader=false is set
> 
>
> Key: HIVE-7200
> URL: https://issues.apache.org/jira/browse/HIVE-7200
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
>
> A few minor/cosmetic issues with the beeline CLI.
> 1) The tool prints the column headers despite --showHeader being set to false. 
> This property only seems to affect the subsequent header information that 
> gets printed based on the value of the property "headerInterval" (default value 
> is 100).
> 2) When "showHeader" is true & "headerInterval > 0", the header after the 
> first interval gets printed after  rows. The code seems 
> to count the initial header as a row, if you will.
> 3) The table footer (the line that closes the table) does not get printed if 
> "showHeader" is false. I think the table should get closed irrespective 
> of whether it prints the header or not.
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> +--+
> 6 rows selected (3.998 seconds)
> 0: jdbc:hive2://localhost:1> !set headerInterval 2
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> +--+
> | val  |
> +--+
> | f|
> | T|
> +--+
> | val  |
> +--+
> | F|
> | 0|
> +--+
> | val  |
> +--+
> | 1|
> +--+
> 6 rows selected (0.691 seconds)
> 0: jdbc:hive2://localhost:1> !set showHeader false
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> 6 rows selected (1.728 seconds)





[jira] [Commented] (HIVE-3508) create cube and rollup operators in hive without mapside aggregation

2014-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025650#comment-14025650
 ] 

Hive QA commented on HIVE-3508:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12648932/HIVE-3508.patch.txt

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/415/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/415/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-415/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-415/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-minikdc/target itests/hive-unit/target 
itests/custom-serde/target itests/util/target hcatalog/target 
hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target 
hwi/target common/target common/src/gen contrib/target service/target 
serde/target beeline/target odbc/target cli/target 
ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1601492.

At revision 1601492.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12648932

> create cube and rollup operators in hive without mapside aggregation
> 
>
> Key: HIVE-3508
> URL: https://issues.apache.org/jira/browse/HIVE-3508
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
> Attachments: HIVE-3508.patch.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7079) Hive logs errors about missing tables when parsing CTE expressions

2014-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025645#comment-14025645
 ] 

Hive QA commented on HIVE-7079:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12648927/HIVE-7079.1.patch.txt

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 5607 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union31
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_joins
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimal
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalX
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalXY
org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/414/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/414/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-414/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12648927

> Hive logs errors about missing tables when parsing CTE expressions
> --
>
> Key: HIVE-7079
> URL: https://issues.apache.org/jira/browse/HIVE-7079
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.0
>Reporter: Craig Condit
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7079.1.patch.txt
>
>
> Given a query containing common table expressions (CTE) such as:
> WITH a AS (SELECT ...), b AS (SELECT ...)
> SELECT * FROM a JOIN b on a.col = b.col ...;
> Hive CLI executes the query, but logs stack traces at ERROR level during 
> query parsing:
> {noformat}
> ERROR metadata.Hive: NoSuchObjectException(message:ccondit.a table not found)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29338)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29306)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result.read(ThriftHiveMetastore.java:29237)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1036)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>   at com.sun.proxy.$Proxy7.getTable(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:967)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:909)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1223)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1192)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9209)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:391)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:291)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:944)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1009)
>   at org.a

[jira] [Assigned] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set

2014-06-09 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-7200:
---

Assignee: Naveen Gangam

> Beeline output displays column heading even if --showHeader=false is set
> 
>
> Key: HIVE-7200
> URL: https://issues.apache.org/jira/browse/HIVE-7200
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
>
> A few minor/cosmetic issues with the beeline CLI.
> 1) The tool prints the column headers despite --showHeader being set to 
> false. This property only seems to affect the subsequent header rows that 
> get printed based on the value of the "headerInterval" property (default 
> value is 100).
> 2) When "showHeader" is true & "headerInterval > 0", the header after the 
> first interval gets printed after  rows. The code seems 
> to count the initial header as a row, if you will.
> 3) The table footer (the line that closes the table) does not get printed if 
> "showHeader" is false. I think the table should be closed irrespective of 
> whether it prints the header or not.
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> +--+
> 6 rows selected (3.998 seconds)
> 0: jdbc:hive2://localhost:1> !set headerInterval 2
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> +--+
> | val  |
> +--+
> | f|
> | T|
> +--+
> | val  |
> +--+
> | F|
> | 0|
> +--+
> | val  |
> +--+
> | 1|
> +--+
> 6 rows selected (0.691 seconds)
> 0: jdbc:hive2://localhost:1> !set showHeader false
> 0: jdbc:hive2://localhost:1> select * from stringvals;
> +--+
> | val  |
> +--+
> | t|
> | f|
> | T|
> | F|
> | 0|
> | 1|
> 6 rows selected (1.728 seconds)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set

2014-06-09 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-7200:
---

 Summary: Beeline output displays column heading even if 
--showHeader=false is set
 Key: HIVE-7200
 URL: https://issues.apache.org/jira/browse/HIVE-7200
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.0
Reporter: Naveen Gangam
Priority: Minor


A few minor/cosmetic issues with the beeline CLI.
1) The tool prints the column headers despite --showHeader being set to false. 
This property only seems to affect the subsequent header rows that get printed 
based on the value of the "headerInterval" property (default value is 100).
2) When "showHeader" is true & "headerInterval > 0", the header after the first 
interval gets printed after  rows. The code seems to count 
the initial header as a row, if you will.
3) The table footer (the line that closes the table) does not get printed if 
"showHeader" is false. I think the table should be closed irrespective of 
whether it prints the header or not.

0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val  |
+--+
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
6 rows selected (3.998 seconds)
0: jdbc:hive2://localhost:1> !set headerInterval 2
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val  |
+--+
| t|
+--+
| val  |
+--+
| f|
| T|
+--+
| val  |
+--+
| F|
| 0|
+--+
| val  |
+--+
| 1|
+--+
6 rows selected (0.691 seconds)
0: jdbc:hive2://localhost:1> !set showHeader false
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val  |
+--+
| t|
| f|
| T|
| F|
| 0|
| 1|
6 rows selected (1.728 seconds)
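The off-by-one described in point 2 can be modeled outside Beeline. The toy renderer below is a sketch of the *intended* behavior under the reporter's reading (the initial header is not counted as a data row, and the closing separator is always printed, even with the header suppressed); it is not Beeline's actual code.

```python
def render(rows, header="| val  |", sep="+--+", show_header=True, interval=0):
    """Toy model of the intended table output: the initial header is not
    counted as a data row, and the closing separator is always printed."""
    out = []
    if show_header:
        out += [sep, header, sep]
    for i, row in enumerate(rows):
        # Repeat the header every `interval` *data* rows, never before row 0.
        if show_header and interval > 0 and i > 0 and i % interval == 0:
            out += [sep, header, sep]
        out.append(row)
    out.append(sep)  # close the table even when the header is suppressed
    return out
```

With `interval=2`, the header repeats only after every two data rows, and with `show_header=False` the table is still closed by the trailing separator.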





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025627#comment-14025627
 ] 

Eugene Koifman commented on HIVE-7065:
--

What's strange is that that is the test added for this ticket.

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"
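The property described above can be illustrated with a hedged example. The property name follows the webhcat-default.xml convention, but the exact value list below is an assumption for illustration, not the project's shipped default:

```xml
<!-- Hypothetical webhcat-site.xml fragment: pass hive.execution.engine
     through to the Hive client that WebHCat launches for each job. -->
<property>
  <name>templeton.hive.properties</name>
  <value>hive.metastore.local=false,hive.metastore.uris=thrift://metastore:9083,hive.execution.engine=tez</value>
  <description>Comma-separated Hive config properties forwarded to the
  Hive client on the node that executes a WebHCat-submitted job.</description>
</property>
```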



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025613#comment-14025613
 ] 

Eugene Koifman commented on HIVE-7065:
--

java.lang.IllegalArgumentException: Illegal escaped string 
hive.some.fake.path=C:\foo\bar.txt\ unescaped \ at 22
  at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:565)
  at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:547)
  at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:533)
  at 
org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing(TestTempletonUtils.java:308)
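The exception above is thrown by Hadoop's StringUtils.unEscapeString while parsing templeton.hive.properties. As a rough model of the escaping rule it enforces (a backslash may only precede the separator or another backslash; anything else, including a trailing backslash, is an illegal escaped string), here is a sketch of the documented semantics, not the actual Hive/Hadoop code:

```python
def unescape(s, escape_char="\\", escaped=","):
    """Rough model of Hadoop StringUtils.unEscapeString: a backslash may
    only escape the separator or another backslash; any other use of the
    escape character raises, mirroring the IllegalArgumentException above."""
    out = []
    i = 0
    while i < len(s):
        c = s[i]
        if c == escape_char:
            # A lone escape char at end-of-string, or one followed by a
            # non-escapable character (like the '\f' in C:\foo), is illegal.
            if i + 1 >= len(s) or s[i + 1] not in (escape_char, escaped):
                raise ValueError(
                    "Illegal escaped string %s unescaped %s at %d" % (s, escape_char, i))
            out.append(s[i + 1])
            i += 2
        else:
            out.append(c)
            i += 1
    return "".join(out)
```

This is why a Windows-style value like `C:\foo\bar.txt\` fails: `\f`, `\b`, and the trailing `\` are all unescaped escape characters.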

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7175) Provide password file option to beeline

2014-06-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025605#comment-14025605
 ] 

Xuefu Zhang commented on HIVE-7175:
---

[~dr.wendell.urth] Thanks for working on this. Could you please provide a 
review board entry for the patch?

> Provide password file option to beeline
> ---
>
> Key: HIVE-7175
> URL: https://issues.apache.org/jira/browse/HIVE-7175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Clients
>Affects Versions: 0.13.0
>Reporter: Robert Justice
>  Labels: features, security
> Attachments: HIVE-7175.patch
>
>
> For people connecting to Hive Server 2 with LDAP authentication enabled, in 
> order to batch run commands, we currently have to provide the password openly 
> in the command line.   They could use some expect scripting, but I think a 
> valid improvement would be to provide a password file option similar to other 
> CLI commands in hadoop (e.g. sqoop) to be more secure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7199) Cannot alter table to parquet

2014-06-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025602#comment-14025602
 ] 

Xuefu Zhang commented on HIVE-7199:
---

+1 pending on test result.

> Cannot alter table to parquet
> -
>
> Key: HIVE-7199
> URL: https://issues.apache.org/jira/browse/HIVE-7199
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Vasanth kumar RJ
>Assignee: Vasanth kumar RJ
> Fix For: 0.14.0
>
> Attachments: HIVE-7199.patch
>
>
> Unable to alter a table to parquet:
> >alter table t1 set fileformat parquet;
> The table then cannot be queried.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7199) Cannot alter table to parquet

2014-06-09 Thread Vasanth kumar RJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasanth kumar RJ updated HIVE-7199:
---

Fix Version/s: 0.14.0
   Status: Patch Available  (was: Open)

> Cannot alter table to parquet
> -
>
> Key: HIVE-7199
> URL: https://issues.apache.org/jira/browse/HIVE-7199
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.1, 0.14.0
>Reporter: Vasanth kumar RJ
>Assignee: Vasanth kumar RJ
> Fix For: 0.14.0
>
> Attachments: HIVE-7199.patch
>
>
> Unable to alter a table to parquet:
> >alter table t1 set fileformat parquet;
> The table then cannot be queried.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7199) Cannot alter table to parquet

2014-06-09 Thread Vasanth kumar RJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasanth kumar RJ updated HIVE-7199:
---

Attachment: HIVE-7199.patch

> Cannot alter table to parquet
> -
>
> Key: HIVE-7199
> URL: https://issues.apache.org/jira/browse/HIVE-7199
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Vasanth kumar RJ
>Assignee: Vasanth kumar RJ
> Attachments: HIVE-7199.patch
>
>
> Unable to alter a table to parquet:
> >alter table t1 set fileformat parquet;
> The table then cannot be queried.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7199) Cannot alter table to parquet

2014-06-09 Thread Vasanth kumar RJ (JIRA)
Vasanth kumar RJ created HIVE-7199:
--

 Summary: Cannot alter table to parquet
 Key: HIVE-7199
 URL: https://issues.apache.org/jira/browse/HIVE-7199
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1, 0.14.0
Reporter: Vasanth kumar RJ
Assignee: Vasanth kumar RJ


Unable to alter a table to parquet:

>alter table t1 set fileformat parquet;
The table then cannot be queried.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Parquet Column Index Access in Hive

2014-06-09 Thread Daniel Weeks
Brock,

Could someone look into reviewing this? We're hoping to get Parquet
rolled out internally, and this is a pretty important feature for us.

Thanks,
-Dan


On Mon, May 19, 2014 at 10:50 AM, Daniel Weeks  wrote:

> No, my test passed and I don't think any of the others are related.
>
> -Dan
>
>
> On Mon, May 19, 2014 at 10:44 AM, Brock Noland  wrote:
>
>> Hi,
>>
>> Did any of your tests fail? If not, then we should be able to review.
>>
>> Brock
>>
>>
>> On Mon, May 19, 2014 at 12:12 PM, Daniel Weeks 
>> wrote:
>>
>>> Brock,
>>>
>>> I'm not sure where we stand at this point.  Do I need to resubmit after
>>> the problems with Java 7 are worked through or is it ok to leave it in its
>>> current state?
>>>
>>> Thanks,
>>> Dan
>>>
>>>
>>> On Fri, May 16, 2014 at 10:41 AM, Brock Noland 
>>> wrote:
>>>
 I believe the results for the latest patch have just been posted.
 You'll see a bunch of unrelated failures since we just switched to Java 7.


 On Fri, May 16, 2014 at 12:39 PM, Daniel Weeks 
 wrote:

> I updated the patch on the JIRA ticket, but the Hive QA hasn't
> triggered yet.  I had problems with this previously and was just wondering
> if I hit the same issue again.
>
> Thanks,
>  Dan
>
>
> On Tue, May 13, 2014 at 4:21 PM, Xuefu Zhang 
> wrote:
>
>> Yeah. I saw some doubts on HIVE-6936 as well. Not sure whether or
>> when it can get through. I'm fine with the global config approach, which, 
>> once
>> in place, will probably stay unless it's changed before it's released.
>>
>>
>> On Tue, May 13, 2014 at 4:07 PM, Daniel Weeks 
>> wrote:
>>
>>> That would be nice, but I didn't see a lot of movement on that issue
>>> in the last few weeks.  Since the parquet integration can be done in two
>>> steps, it isn't really dependent on 6936 for the many who want to use
>>> column-based index access as the default.
>>>
>>> Any idea what the timeline is for 6936?  Is this even a priority?
>>>
>>> Thanks,
>>> Dan
>>>
>>>
>>> On Tue, May 13, 2014 at 3:43 PM, Xuefu Zhang 
>>> wrote:
>>>
 I actually meant pushing
 https://issues.apache.org/jira/browse/HIVE-6936 forward first.



 On Tue, May 13, 2014 at 3:41 PM, Xuefu Zhang 
 wrote:

> Thanks, Daniel. It might be better if we can push HIVE-6938
> forward so that we can do it once and for all. It's hard to remove a 
> config
> once it's released.
>
> --Xuefu
>
>
> On Tue, May 13, 2014 at 2:59 PM, Daniel Weeks 
> wrote:
>
>> I've updated the patch for HIVE-6938
>>  to be a global
>> setting (maintaining the default behavior for existing parquet-hive 
>> users).
>>  When HIVE-6936 gets sorted out and a path is determined for 
>> exposing table
>> properties to input formats, I'll update to also allow a table level 
>> switch.
>>
>> Thanks,
>> -Dan
>>
>>
>> On Tue, May 13, 2014 at 12:28 PM, Daniel Weeks <
>> dwe...@netflix.com> wrote:
>>
>>> Xuefu,
>>>
>>> Unfortunately, parquet can't simply try by name and fallback to
>>> index.  The two approaches are orthogonal and mixing modes can 
>>> cause all
>>> sorts of problems.  You can read a little more about the various 
>>> access
>>> schemes here:
>>> https://github.com/Parquet/parquet-format/issues/91
>>>
>>> The JIRA you indicated is exactly what we need to make this
>>> configurable at the table level.
>>>
>>> I can modify my patch to use the global setting and ignore the
>>> table setting until 6936 is resolved.
>>>
>>> Thanks,
>>> Dan
>>>
>>>
>>> On Tue, May 13, 2014 at 10:59 AM, Xuefu Zhang <
>>> xzh...@cloudera.com> wrote:
>>>
 My preference is less configurations. Could parquet first
 access by name, and retry by index upon failure? As long as we 
 clearly
 document the behavior, we should be Okay.

 If configuration turns out to be most viable, Does this help in
 any way? https://issues.apache.org/jira/browse/HIVE-6936

 Thanks,
 Xuefu


 On Mon, May 12, 2014 at 7:20 PM, Brock Noland <
 br...@cloudera.com> wrote:

> Hi Daniel,
>
> Thank you for the information. Nong, Szehon, or Xeufu, do you
> have any thoughts on this? If we are going to have a global flag, 
> my
> thought would be default this to on.
>>>

[jira] [Commented] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025556#comment-14025556
 ] 

Eugene Koifman commented on HIVE-6226:
--

[~leftylev]
Here is an example:
http://localhost:50111/templeton/v1/version/hive?user.name=ekoifman

which returns:

{"module":"hive","version":"0.14.0-SNAPSHOT"}


http://localhost:50111/templeton/v1/version/hadoop?user.name=ekoifman
returns:
{"module":"hadoop","version":"2.4.1-SNAPSHOT"}


http://localhost:50111/templeton/v1/version/pig?user.name=ekoifman and 
http://localhost:50111/templeton/v1/version/sqoop?user.name=ekoifman are both 
there as well, but will return
{"error":"Pig version request not yet implemented"}
The last two are not really implemented, so I'm not sure they should be 
documented.
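A minimal client sketch against the endpoints above. The host, port, and user come from the example URLs, and the parsing assumes only the documented {"module": ..., "version": ...} response shape (plus the {"error": ...} shape returned for pig/sqoop); it is an illustration, not WebHCat code:

```python
import json
from urllib.parse import urlencode

def version_url(module, user, host="localhost", port=50111):
    """Build a WebHCat version URL as shown above, e.g.
    http://localhost:50111/templeton/v1/version/hive?user.name=ekoifman"""
    return "http://%s:%d/templeton/v1/version/%s?%s" % (
        host, port, module, urlencode({"user.name": user}))

def parse_version(body):
    """Parse the documented response into a (module, version) pair; raise
    on the {"error": ...} shape returned by unimplemented modules."""
    doc = json.loads(body)
    if "error" in doc:
        raise RuntimeError(doc["error"])  # e.g. "Pig version request not yet implemented"
    return doc["module"], doc["version"]
```

For example, feeding the response quoted above, `parse_version('{"module":"hive","version":"0.14.0-SNAPSHOT"}')` yields `("hive", "0.14.0-SNAPSHOT")`.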



> It should be possible to get hadoop, hive, and pig version being used by 
> WebHCat
> 
>
> Key: HIVE-6226
> URL: https://issues.apache.org/jira/browse/HIVE-6226
> Project: Hive
>  Issue Type: New Feature
>  Components: WebHCat
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.13.0
>
> Attachments: HIVE-6226.2.patch, HIVE-6226.patch
>
>
> Calling /version on WebHCat tells the caller the protocol verison, but there 
> is no way to determine the versions of software being run by the applications 
> that WebHCat spawns.  
> I propose to add an end-point: /version/\{module\} where module could be pig, 
> hive, or hadoop.  The response will then be:
> {code}
> {
>   "module" : _module_name_,
>   "version" : _version_string_
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-1434) Cassandra Storage Handler

2014-06-09 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo resolved HIVE-1434.
---

Resolution: Won't Fix

> Cassandra Storage Handler
> -
>
> Key: HIVE-1434
> URL: https://issues.apache.org/jira/browse/HIVE-1434
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.7.0
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
> Attachments: HIVE-1434-r1182878.patch, cas-handle.tar.gz, 
> cass_handler.diff, hive-1434-1.txt, hive-1434-2-patch.txt, 
> hive-1434-2011-02-26.patch.txt, hive-1434-2011-03-07.patch.txt, 
> hive-1434-2011-03-07.patch.txt, hive-1434-2011-03-14.patch.txt, 
> hive-1434-3-patch.txt, hive-1434-4-patch.txt, hive-1434-5.patch.txt, 
> hive-1434.2011-02-27.diff.txt, hive-cassandra.2011-02-25.txt, hive.diff
>
>
> Add a cassandra storage handler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-1434) Cassandra Storage Handler

2014-06-09 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025539#comment-14025539
 ] 

Edward Capriolo commented on HIVE-1434:
---

This feature is a complete and utter failure. It was never committed to Hive. 
It was never committed to Cassandra. I find ~40 forks of the code that are 
likely derivative works, make no reference to me or Hive, and all sorts of 
people are now asserting copyright over them. I am closing this issue and 
making a clean-room implementation of a new handler.

> Cassandra Storage Handler
> -
>
> Key: HIVE-1434
> URL: https://issues.apache.org/jira/browse/HIVE-1434
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.7.0
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
> Attachments: HIVE-1434-r1182878.patch, cas-handle.tar.gz, 
> cass_handler.diff, hive-1434-1.txt, hive-1434-2-patch.txt, 
> hive-1434-2011-02-26.patch.txt, hive-1434-2011-03-07.patch.txt, 
> hive-1434-2011-03-07.patch.txt, hive-1434-2011-03-14.patch.txt, 
> hive-1434-3-patch.txt, hive-1434-4-patch.txt, hive-1434-5.patch.txt, 
> hive-1434.2011-02-27.diff.txt, hive-cassandra.2011-02-25.txt, hive.diff
>
>
> Add a cassandra storage handler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6394) Implement Timestmap in ParquetSerde

2014-06-09 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025534#comment-14025534
 ] 

Brock Noland commented on HIVE-6394:


[~szehon] I see parquet_timestamp failed.

> Implement Timestmap in ParquetSerde
> ---
>
> Key: HIVE-6394
> URL: https://issues.apache.org/jira/browse/HIVE-6394
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Jarek Jarcec Cecho
>Assignee: Szehon Ho
>  Labels: Parquet
> Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch, 
> HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.patch
>
>
> This JIRA is to implement timestamp support in Parquet SerDe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025480#comment-14025480
 ] 

Eugene Koifman commented on HIVE-7065:
--

[~leftylev] Is there a way to have this table in the wiki autogenerated from 
webhcat-default.xml? That would ensure a single source of truth. 
Tez shipped in 0.13, so yes, I think hive.execution.engine can be mentioned 
for 0.13.

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025469#comment-14025469
 ] 

Eugene Koifman commented on HIVE-7065:
--

good point, this should have pre-commit tests


> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Description: 
WebHCat config has templeton.hive.properties to specify Hive config properties 
that need to be passed to Hive client on node executing a job submitted through 
WebHCat (hive query, for example).

this should include "hive.execution.engine"


  was:
WebHCat config has templeton.hive.properties to specify Hive config properties 
that need to be passed to Hive client on node executing a job submitted through 
WebHCat (hive query, for example).

this should include "hive.execution.engine"

NO PRECOMMIT TESTS


> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Case sensitive column names during HIVE table creation

2014-06-09 Thread Divya Ravishankar
Hello,

I am trying to create an external Hive table with partitions. Some of my
column names contain upper-case letters, which caused a problem when creating
tables: the values of columns whose names contain upper-case letters were
returned as NULL. I then modified the ParquetSerDe to handle this via
SERDEPROPERTIES, and that worked. Now, when I create an external table with
partitions and try to access the upper-case columns (e.g. FieldName), I get
this error:
select FieldName from tablename;
FAILED: RuntimeException java.lang.RuntimeException: cannot find field
FieldName from [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@4f45884b,
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@8f11f27,
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@77e8eb0e,
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1dae4cd,
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@623e336d]

Are there any suggestions you can think of?

This is the command I use to create tables -

CREATE EXTERNAL TABLE tablename (fieldname string)
PARTITIONED BY (partion_name string)
ROW FORMAT SERDE 'path.ModifiedParquetSerDeLatest'
WITH SERDEPROPERTIES ("casesensitive"="FieldName")
STORED AS INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
LOCATION 'path to location';

Best,
Divya Ravishankar


[jira] [Commented] (HIVE-2365) SQL support for bulk load into HBase

2014-06-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025397#comment-14025397
 ] 

Ashutosh Chauhan commented on HIVE-2365:


Move the moveTask logic into the OutputCommitter of HiveHFileOutputFormat. In 
the HadoopShims::prepareJobOutput() method, detect that the job is doing a bulk 
load into HBase and set the OutputCommitter to HiveHFileOutputCommitter instead 
of NullOutputCommitter.

> SQL support for bulk load into HBase
> 
>
> Key: HIVE-2365
> URL: https://issues.apache.org/jira/browse/HIVE-2365
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: John Sichi
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-2365.2.patch.txt, HIVE-2365.3.patch, 
> HIVE-2365.3.patch, HIVE-2365.WIP.00.patch, HIVE-2365.WIP.01.patch, 
> HIVE-2365.WIP.01.patch
>
>
> Support the "as simple as this" SQL for bulk load from Hive into HBase.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7196) Configure session by single open session call

2014-06-09 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025395#comment-14025395
 ] 

Vaibhav Gumashta commented on HIVE-7196:


+1

> Configure session by single open session call
> -
>
> Key: HIVE-7196
> URL: https://issues.apache.org/jira/browse/HIVE-7196
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-7196.1.patch.txt
>
>
> Currently, a jdbc2 connection executes a set command for each conf/var; 
> these could instead be embedded in TOpenSessionReq.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 22370: Configure session by single open session call

2014-06-09 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22370/#review45094
---

Ship it!


Ship It!

- Vaibhav Gumashta


On June 9, 2014, 9 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22370/
> ---
> 
> (Updated June 9, 2014, 9 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-7196
> https://issues.apache.org/jira/browse/HIVE-7196
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently, a jdbc2 connection executes a set command for each conf/var; 
> these could instead be embedded in TOpenSessionReq.
> 
> 
> Diffs
> -
> 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
> 192ee6b 
>   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 8595677 
>   jdbc/src/java/org/apache/hive/jdbc/Utils.java 87fec11 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java 92d5e75 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> a9d5902 
> 
> Diff: https://reviews.apache.org/r/22370/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



[jira] [Commented] (HIVE-7159) For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition

2014-06-09 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025382#comment-14025382
 ] 

Gopal V commented on HIVE-7159:
---

[~ashutoshc]: the patch still applies, but it no longer builds.

The patch relied on genJoinReduceSinkChild(QB qb, QBJoinTree joinTree, ... pos) 
as the signature of the method it modified; the QBJoinTree and pos parameters 
have since been removed.

> For inner joins push a 'is not null predicate' to the join sources for every 
> non nullSafe join condition
> 
>
> Key: HIVE-7159
> URL: https://issues.apache.org/jira/browse/HIVE-7159
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Butani
>Assignee: Harish Butani
> Attachments: HIVE-7159.1.patch
>
>
> A join B on A.x = B.y
> can be transformed to
> (A where x is not null) join (B where y is not null) on A.x = B.y
> Apart from avoiding shuffling null-keyed rows, it also avoids issues with 
> reduce-side skew when there are a lot of null values in the data.
> Thanks to [~gopalv] for the analysis and coming up with the solution.
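
The rewrite described above, sketched in HQL using the names from the
description:

```sql
-- Original inner join:
SELECT * FROM A JOIN B ON A.x = B.y;

-- Transformed form: rows with NULL join keys can never match under a
-- non-null-safe '=' condition, so they can be filtered at the sources:
SELECT *
FROM (SELECT * FROM A WHERE x IS NOT NULL) A2
JOIN (SELECT * FROM B WHERE y IS NOT NULL) B2
  ON A2.x = B2.y;
```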



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-06-09 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025376#comment-14025376
 ] 

Nick Dimiduk commented on HIVE-6473:


Made a small change. Thanks Sushanth!

> Allow writing HFiles via HBaseStorageHandler table
> --
>
> Key: HIVE-6473
> URL: https://issues.apache.org/jira/browse/HIVE-6473
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, 
> HIVE-6473.1.patch.txt, HIVE-6473.2.patch, HIVE-6473.3.patch, 
> HIVE-6473.4.patch, HIVE-6473.5.patch, HIVE-6473.6.patch
>
>
> Generating HFiles for bulkload into HBase could be more convenient. Right now 
> we require the user to register a new table with the appropriate output 
> format. This patch allows the exact same functionality, but through an 
> existing table managed by the HBaseStorageHandler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-06-09 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-6473:
---

Release Note: 
Allows direct creation of HFiles and location for them as part of 
HBaseStorageHandler write if the following properties are specified in the HQL:

set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/columnfamily_name;

hfile.family.path can also be set as a table property, HQL value takes 
precedence.

  was:
Allows direct creation of HFiles and location for them as part of 
HBaseStorageHandler write if the following properties are specified in the HQL:

set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/columnfamily_name;


> Allow writing HFiles via HBaseStorageHandler table
> --
>
> Key: HIVE-6473
> URL: https://issues.apache.org/jira/browse/HIVE-6473
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, 
> HIVE-6473.1.patch.txt, HIVE-6473.2.patch, HIVE-6473.3.patch, 
> HIVE-6473.4.patch, HIVE-6473.5.patch, HIVE-6473.6.patch
>
>
> Generating HFiles for bulkload into HBase could be more convenient. Right now 
> we require the user to register a new table with the appropriate output 
> format. This patch allows the exact same functionality, but through an 
> existing table managed by the HBaseStorageHandler.
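
Putting the release-note properties together, the bulk-load usage might look
like this (the table names are hypothetical; only the two set commands come
from the release note above):

```sql
set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/columnfamily_name;
-- Writing through the HBaseStorageHandler-managed table now emits HFiles
-- under hfile.family.path instead of writing to HBase directly:
INSERT OVERWRITE TABLE hbase_backed_table
SELECT * FROM source_table;
```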



--
This message was sent by Atlassian JIRA
(v6.2#6252)



[jira] [Updated] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-06-09 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-6473:
---

Release Note: 
Allows direct creation of HFiles and location for them as part of 
HBaseStorageHandler write if the following properties are specified in the HQL:

set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/columnfamily_name;

  was:
Allows direct creation of HFiles and location for them as part of 
HBaseStorageHandler write if the following properties are specified in the HQL:

set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/hfilelocn;


> Allow writing HFiles via HBaseStorageHandler table
> --
>
> Key: HIVE-6473
> URL: https://issues.apache.org/jira/browse/HIVE-6473
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, 
> HIVE-6473.1.patch.txt, HIVE-6473.2.patch, HIVE-6473.3.patch, 
> HIVE-6473.4.patch, HIVE-6473.5.patch, HIVE-6473.6.patch
>
>
> Generating HFiles for bulkload into HBase could be more convenient. Right now 
> we require the user to register a new table with the appropriate output 
> format. This patch allows the exact same functionality, but through an 
> existing table managed by the HBaseStorageHandler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase

2014-06-09 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-2365:
---

Status: Patch Available  (was: Open)

> SQL support for bulk load into HBase
> 
>
> Key: HIVE-2365
> URL: https://issues.apache.org/jira/browse/HIVE-2365
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: John Sichi
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-2365.2.patch.txt, HIVE-2365.3.patch, 
> HIVE-2365.3.patch, HIVE-2365.WIP.00.patch, HIVE-2365.WIP.01.patch, 
> HIVE-2365.WIP.01.patch
>
>
> Support the "as simple as this" SQL for bulk load from Hive into HBase.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-06-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6473:
---

Release Note: 
Allows direct creation of HFiles and location for them as part of 
HBaseStorageHandler write if the following properties are specified in the HQL:

set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/hfilelocn;

  was:
Allows direct creation of HFiles and location for them if the following 
properties are specified in the HQL:

set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/hfilelocn;


> Allow writing HFiles via HBaseStorageHandler table
> --
>
> Key: HIVE-6473
> URL: https://issues.apache.org/jira/browse/HIVE-6473
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, 
> HIVE-6473.1.patch.txt, HIVE-6473.2.patch, HIVE-6473.3.patch, 
> HIVE-6473.4.patch, HIVE-6473.5.patch, HIVE-6473.6.patch
>
>
> Generating HFiles for bulkload into HBase could be more convenient. Right now 
> we require the user to register a new table with the appropriate output 
> format. This patch allows the exact same functionality, but through an 
> existing table managed by the HBaseStorageHandler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-06-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6473:
---

  Resolution: Fixed
Release Note: 
Allows direct creation of HFiles and location for them if the following 
properties are specified in the HQL:

set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/hfilelocn;
  Status: Resolved  (was: Patch Available)

Committed. Thanks, Nick!

Could you please check/edit the Release note for this jira for accuracy?

> Allow writing HFiles via HBaseStorageHandler table
> --
>
> Key: HIVE-6473
> URL: https://issues.apache.org/jira/browse/HIVE-6473
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 0.14.0
>
> Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, 
> HIVE-6473.1.patch.txt, HIVE-6473.2.patch, HIVE-6473.3.patch, 
> HIVE-6473.4.patch, HIVE-6473.5.patch, HIVE-6473.6.patch
>
>
> Generating HFiles for bulkload into HBase could be more convenient. Right now 
> we require the user to register a new table with the appropriate output 
> format. This patch allows the exact same functionality, but through an 
> existing table managed by the HBaseStorageHandler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

