[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Adachi updated HIVE-2137: - Fix Version/s: 0.10.1 Status: Patch Available (was: Open) JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.10.1 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Adachi updated HIVE-2137: - Fix Version/s: (was: 0.11.1) (was: 0.10.1) Status: Patch Available (was: Open) JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Adachi updated HIVE-2137: - Fix Version/s: 0.12.0 0.11.1 Status: Open (was: Patch Available) JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.10.1, 0.11.1, 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4927) When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
[ https://issues.apache.org/jira/browse/HIVE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720484#comment-13720484 ] Phabricator commented on HIVE-4927: --- ashutoshc has accepted the revision HIVE-4927 [jira] When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed. +1 REVISION DETAIL https://reviews.facebook.net/D11811 BRANCH HIVE-4927 ARCANIST PROJECT hive To: JIRA, ashutoshc, yhuai When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed Key: HIVE-4927 URL: https://issues.apache.org/jira/browse/HIVE-4927 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Attachments: HIVE-4927.D11811.1.patch, HIVE-4927.D11811.2.patch, HIVE-4927.D11811.3.patch, HIVE-4927.D11811.3.patch {code} set hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask=true; EXPLAIN SELECT x1.key AS key FROM src x1 JOIN src1 y1 ON (x1.key = y1.key) JOIN src1 y2 ON (x1.value = y2.value) GROUP BY x1.key; {code} We will get an NPE from MetadataOnlyOptimizer. The reason is that the operator tree of the MapRedTask evaluating two MapJoins is {code} TS1-MapJoin1-TS2-MapJoin2-... {code} We should remove the TS2... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
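The splice described above ("We should remove the TS2") can be sketched in plain Java. This is a deliberately simplified model, not Hive's actual Operator API: a node is cut out of the tree by reconnecting its parents directly to its children.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch (not Hive's real Operator classes): splice a redundant
// operator out of a TS1-MapJoin1-TS2-MapJoin2 chain.
public class SpliceSketch {
    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();
        Op(String name) { this.name = name; }
        void connectTo(Op child) { children.add(child); child.parents.add(this); }
    }

    // Remove 'op' from the tree by linking each of its parents to each of its children.
    static void splice(Op op) {
        for (Op parent : op.parents) {
            parent.children.remove(op);
            parent.children.addAll(op.children);
        }
        for (Op child : op.children) {
            child.parents.remove(op);
            child.parents.addAll(op.parents);
        }
    }

    public static void main(String[] args) {
        Op ts1 = new Op("TS1"), mj1 = new Op("MapJoin1");
        Op ts2 = new Op("TS2"), mj2 = new Op("MapJoin2");
        ts1.connectTo(mj1);
        mj1.connectTo(ts2);
        ts2.connectTo(mj2);
        splice(ts2); // chain becomes TS1-MapJoin1-MapJoin2
        System.out.println(mj1.children.get(0).name); // MapJoin2
    }
}
```

After the splice, MapJoin1 feeds MapJoin2 directly, which is the shape the optimizer expects and avoids the NPE described in the report.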
[jira] [Commented] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720495#comment-13720495 ] Jason Dere commented on HIVE-4055: -- Looks like I had added both date and timestamp support to RegexSerDe. I'll update the diff accordingly. Both Jdbc tests work for me. Well, actually they seem to fail (with or without this diff) if they're the first tests I run, but if I run a qfile before running the Jdbc tests they seem to pass fine. add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Assignee: Jason Dere Attachments: Date.pdf, HIVE-4055.1.patch.txt, HIVE-4055.2.patch.txt, HIVE-4055.3.patch.txt, HIVE-4055.D11547.1.patch Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4055: - Attachment: HIVE-4055.4.patch.txt New patch HIVE-4055.4.patch.txt, with updated unit test output for serde_regex.q add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Assignee: Jason Dere Attachments: Date.pdf, HIVE-4055.1.patch.txt, HIVE-4055.2.patch.txt, HIVE-4055.3.patch.txt, HIVE-4055.4.patch.txt, HIVE-4055.D11547.1.patch Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4927) When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
[ https://issues.apache.org/jira/browse/HIVE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720573#comment-13720573 ] Hive QA commented on HIVE-4927: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594308/HIVE-4927.D11811.3.patch {color:green}SUCCESS:{color} +1 2652 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/189/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/189/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed Key: HIVE-4927 URL: https://issues.apache.org/jira/browse/HIVE-4927 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Attachments: HIVE-4927.D11811.1.patch, HIVE-4927.D11811.2.patch, HIVE-4927.D11811.3.patch, HIVE-4927.D11811.3.patch {code} set hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask=true; EXPLAIN SELECT x1.key AS key FROM src x1 JOIN src1 y1 ON (x1.key = y1.key) JOIN src1 y2 ON (x1.value = y2.value) GROUP BY x1.key; {code} We will get an NPE from MetadataOnlyOptimizer. The reason is that the operator tree of the MapRedTask evaluating two MapJoins is {code} TS1-MapJoin1-TS2-MapJoin2-... {code} We should remove the TS2... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4299) exported metadata by HIVE-3068 cannot be imported because of wrong file name
[ https://issues.apache.org/jira/browse/HIVE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720604#comment-13720604 ] Hive QA commented on HIVE-4299: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594323/HIVE-4299.1.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/190/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/190/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests failed with: IllegalStateException: Too many bad hosts: 1.0% (10 / 10) is greater than threshold of 50% {noformat} This message is automatically generated. exported metadata by HIVE-3068 cannot be imported because of wrong file name Key: HIVE-4299 URL: https://issues.apache.org/jira/browse/HIVE-4299 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Sho Shimauchi Assignee: Sho Shimauchi Attachments: HIVE-4299.1.patch.txt, HIVE-4299.patch h2. Symptom When DROP TABLE is run on a table, metadata of the table is generated so that the dropped table can be imported again. However, the exported metadata name is 'table name.metadata'. Since ImportSemanticAnalyzer allows only '_metadata' as the metadata filename, users have to rename the metadata file to import the table. h2. 
How to reproduce Set the following setting to hive-site.xml: {code} <property> <name>hive.metastore.pre.event.listeners</name> <value>org.apache.hadoop.hive.ql.parse.MetaDataExportListener</value> </property> {code} Then run the following queries: {code} CREATE TABLE test_table (id INT, name STRING); DROP TABLE test_table; IMPORT TABLE test_table_imported FROM '/path/to/metadata/file'; FAILED: SemanticException [Error 10027]: Invalid path {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
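The filename mismatch described above can be reduced to a one-line check. The names below are hypothetical sketches, not Hive's real identifiers: the importer accepts only the literal file name '_metadata', while the export listener writes 'table name.metadata'.

```java
// Hypothetical sketch of the mismatch (illustrative names, not Hive's real
// code): ImportSemanticAnalyzer accepts only the literal file name
// "_metadata", but the export listener writes "<table>.metadata".
public class MetadataNameSketch {
    static final String IMPORTABLE_NAME = "_metadata";

    // Simplified version of the importer's filename check.
    static boolean isImportable(String fileName) {
        return IMPORTABLE_NAME.equals(fileName);
    }

    public static void main(String[] args) {
        // The exported name is rejected, reproducing the SemanticException path:
        System.out.println(isImportable("test_table.metadata")); // false
        // Renaming the file to "_metadata" is the manual workaround:
        System.out.println(isImportable("_metadata")); // true
    }
}
```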
[jira] [Commented] (HIVE-3756) LOAD DATA does not honor permission inheritence
[ https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720661#comment-13720661 ] Chaoyu Tang commented on HIVE-3756: --- [~ashutoshc] [~sushanth] Thanks for the review. LOAD DATA does not honor permission inheritence - Key: HIVE-3756 URL: https://issues.apache.org/jira/browse/HIVE-3756 Project: Hive Issue Type: Bug Components: Authorization, Security Affects Versions: 0.9.0 Reporter: Johndee Burks Assignee: Chaoyu Tang Fix For: 0.12.0 Attachments: HIVE-3756_1.patch, HIVE-3756_2.patch, HIVE-3756.patch When a LOAD DATA operation is performed the resulting data in hdfs for the table does not maintain permission inheritance. This remains true even with hive.warehouse.subdir.inherit.perms set to true. The issue is easily reproducible by creating a table and loading some data into it. After the load is complete just do a dfs -ls -R on the warehouse directory and you will see that the inheritance of permissions worked for the table directory but not for the data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4683) fix coverage org.apache.hadoop.hive.cli
[ https://issues.apache.org/jira/browse/HIVE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Gorshkov updated HIVE-4683: --- Status: Patch Available (was: Open) fix coverage org.apache.hadoop.hive.cli --- Key: HIVE-4683 URL: https://issues.apache.org/jira/browse/HIVE-4683 Project: Hive Issue Type: Bug Affects Versions: 0.10.1, 0.11.1, 0.12.0 Reporter: Aleksey Gorshkov Assignee: Aleksey Gorshkov Attachments: HIVE-4683-branch-0.10.patch, HIVE-4683-branch-0.10-v1.patch, HIVE-4683-branch-0.11-v1.patch, HIVE-4683-trunk.patch, HIVE-4683-trunk-v1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. was: The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify . It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify . It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify . It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4935) Potential NPE in MetadataOnlyOptimizer
[ https://issues.apache.org/jira/browse/HIVE-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720721#comment-13720721 ] Hive QA commented on HIVE-4935: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594321/HIVE-4935.1.patch {color:green}SUCCESS:{color} +1 2652 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/191/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/191/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Potential NPE in MetadataOnlyOptimizer -- Key: HIVE-4935 URL: https://issues.apache.org/jira/browse/HIVE-4935 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor Attachments: HIVE-4935.1.patch, HIVE-4935.1.patch In MetadataOnlyOptimizer.TableScanProcessor.process, it is possible that we consider a TableScanOperator as MayBeMetadataOnly when this TS does not have a conf. In MetadataOnlyOptimizer.MetadataOnlyTaskDispatcher.dispatch(Node, Stack<Node>, Object...), when we convert this TS, we want to get the alias from its conf. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
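The NPE pattern described above can be illustrated with a minimal model. The classes below are hypothetical stand-ins, not Hive's real Operator/TableScanDesc types: a table scan's conf may be null, so the optimizer must guard before reading the alias from it.

```java
// Simplified sketch of the HIVE-4935 NPE pattern (illustrative classes, not
// Hive's real ones): a TableScan may carry no conf, so any code that reads
// the alias from conf must null-check first.
public class NullGuardSketch {
    static class ScanDesc {
        final String alias;
        ScanDesc(String alias) { this.alias = alias; }
    }
    static class TableScan {
        ScanDesc conf; // may legitimately be null
    }

    // Unsafe pattern: throws NullPointerException when conf is null.
    static String aliasUnsafe(TableScan ts) {
        return ts.conf.alias;
    }

    // Guarded pattern: tolerate scans that carry no conf.
    static String aliasGuarded(TableScan ts) {
        return (ts.conf == null) ? null : ts.conf.alias;
    }

    public static void main(String[] args) {
        TableScan ts = new TableScan(); // conf never set
        System.out.println(aliasGuarded(ts)); // null, no NPE
    }
}
```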
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver for HiveServer1 decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver for HiveServer1 decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. 
In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
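The fix direction stated above (always treat the wire format as UTF-8 on the client) can be sketched in plain Java. This is an illustration of the pattern, not the actual HIVE-2137 patch:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch (not the actual HIVE-2137 patch): the bug pattern is
// new String(bytes) / String.getBytes(), which use the JVM's platform default
// charset (e.g. MS932 on a Japanese Windows client), while the server sends UTF-8.
public class CharsetSketch {
    // Buggy pattern: result depends on the client JVM's default charset.
    static String decodeWithDefault(byte[] raw) {
        return new String(raw); // mojibake risk when the default is not UTF-8
    }

    // Fixed pattern: always decode the server's bytes as UTF-8.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = "日本語".getBytes(StandardCharsets.UTF_8);
        // Correct regardless of the client operating system:
        System.out.println(decodeAsUtf8(utf8Bytes).equals("日本語")); // true
        // decodeWithDefault(utf8Bytes) only matches when this is UTF-8:
        System.out.println(Charset.defaultCharset());
    }
}
```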
Re: [Discuss] project chop up
Also I believe hcatalog web can fall into the same designation. Question: hcatalog was initially a big hive-metastore fork. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in. On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well. On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing on a dual core 2 GB RAM laptop for years now. With the addition of hcatalog, hive-thrift2, and some other growth, trying to develop hive in eclipse on this machine crawls, especially if 'build automatically' is turned on. As we look to add on more things this is only going to get worse. I am also noticing issues like this: https://issues.apache.org/jira/browse/HIVE-4849 What I think we should do is strip down/out optional parts of hive. 1) Hive Hbase This should really be its own project; to do this right we really have to have multiple branches since hbase is not backwards compatible. 2) Hive Web Interface Not really a big project, but not really critical; it can just as easily be built separately. 3) hive thrift 1 We have hive thrift 2 now; it is time for the sun to set on hive thrift 1. 4) odbc Not entirely convinced about this one, but it is really not critical to running hive. What I think we should do is create sub-projects for the above things or simply move them into directories that do not build with hive. Ideally they would use maven to pull dependencies. What does everyone think? I agree that projects like the HBase handler and probably others as well should somehow be downstream projects which simply depend on the hive jars. 
I see a couple alternatives for this: * Take the module in question to the Apache Incubator * Move the module in question to the Apache Extras * Break up the projects within our own source tree I'd prefer the third option at this point. Brock
Re: Extending Explode
So the explode output should be in the order of the array or the map you are exploding. I think you could use rank or row-sequence to give the exploded array a number. If that does not work, adding a new udf might make more sense than extending in this case. On Friday, July 26, 2013, nikolaus.st...@researchgate.net wrote: Hi, I'd like to make a patch that extends the functionality of explode to include an output column with the position of each item in the original array. I imagine this could be useful to the greater community and am wondering if I should extend the current explode function or if I should write a completely new function. Any thoughts on what will be more useful and more likely to be added to the hive-trunk would be greatly appreciated. Thanks, Niko
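The requested behavior, one output row per array element carrying both the position and the value, can be sketched in plain Java, independent of Hive's UDTF API (names are illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the proposed "explode with position" semantics
// (illustrative, not Hive's GenericUDTF API): emit one (pos, value) row per
// element, preserving array order.
public class PosExplodeSketch {
    static List<String> explodeWithPosition(List<String> items) {
        List<String> rows = new ArrayList<>();
        for (int pos = 0; pos < items.size(); pos++) {
            rows.add(pos + "\t" + items.get(pos)); // one output row per element
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : explodeWithPosition(Arrays.asList("a", "b", "c"))) {
            System.out.println(row);
        }
    }
}
```

This is the behavior a new UDTF (rather than a change to explode itself) could provide, matching the reply's suggestion.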
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720802#comment-13720802 ] Hive QA commented on HIVE-3926: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594320/HIVE-3926.D8121.5.patch {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/192/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/192/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch, HIVE-3926.D8121.5.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE < 100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE < 100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4935) Potential NPE in MetadataOnlyOptimizer
[ https://issues.apache.org/jira/browse/HIVE-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4935: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Yin! Potential NPE in MetadataOnlyOptimizer -- Key: HIVE-4935 URL: https://issues.apache.org/jira/browse/HIVE-4935 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4935.1.patch, HIVE-4935.1.patch In MetadataOnlyOptimizer.TableScanProcessor.process, it is possible that we consider a TableScanOperator as MayBeMetadataOnly when this TS does not have a conf. In MetadataOnlyOptimizer.MetadataOnlyTaskDispatcher.dispatch(Node, Stack<Node>, Object...), when we convert this TS, we want to get the alias from its conf. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3632: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Xuefu! Upgrade datanucleus to support JDK7 --- Key: HIVE-3632 URL: https://issues.apache.org/jira/browse/HIVE-3632 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.9.1, 0.10.0, 0.11.0 Reporter: Chris Drome Assignee: Xuefu Zhang Priority: Critical Fix For: 0.12.0 Attachments: HIVE-3632.1.patch, HIVE-3632.2.patch, HIVE-3632.3.patch, HIVE-3632.patch, HIVE-3632.patch.1 I found serious problems with datanucleus code when using JDK7, resulting in some sort of exception being thrown when datanucleus code is entered. I tried source=1.7, target=1.7 with JDK7 as well as source=1.6, target=1.6 with JDK7 and there was no visible difference in that the same unit tests failed. I tried upgrading datanucleus to 3.0.1, as per HIVE-2084.patch, which did not fix the failing tests. I tried upgrading datanucleus to 3.1-release, as per the advice of http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-86, which suggests using ASMv4 will allow datanucleus to work with JDK7. I was not successful with this either. I tried upgrading datanucleus to 3.1.2. I was not successful with this either. Regarding datanucleus support for JDK7+, there is the following JIRA http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-81 which suggests that they don't plan to actively support JDK7+ bytecode any time soon. I also tested the following JVM parameters found on http://veerasundar.com/blog/2012/01/java-lang-verifyerror-expecting-a-stackmap-frame-at-branch-target-jdk-7/ with no success either. This will become a more serious problem as people move to newer JVMs. If there are others who have solved this issue, please post how this was done. Otherwise, it is a topic that I would like to raise for discussion. 
Test Properties: CLEAR LIBRARY CACHE -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
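The blog post the reporter links discusses the JDK7 VerifyError ("Expecting a stackmap frame at branch target") raised by the new split verifier on DataNucleus-enhanced bytecode. As an illustration only (the flag name is general JDK7 knowledge, not quoted in the ticket, and the reporter states the JVM-parameter route did not fix the failing tests), the commonly cited workaround would be passed to Hive via HADOOP_OPTS:

```shell
# Illustrative only: -XX:-UseSplitVerifier disables JDK7's stricter stackmap
# verification and falls back to the JDK6-style verifier. Per the ticket,
# this did NOT resolve the Hive unit-test failures with DataNucleus.
export HADOOP_OPTS="$HADOOP_OPTS -XX:-UseSplitVerifier"
```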
[jira] [Resolved] (HIVE-2084) Upgrade datanucleus from 2.0.3 to a more recent version (3.?)
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-2084. Resolution: Fixed Fix Version/s: 0.12.0 This has been fixed via HIVE-3632 Upgrade datanucleus from 2.0.3 to a more recent version (3.?) - Key: HIVE-2084 URL: https://issues.apache.org/jira/browse/HIVE-2084 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ning Zhang Assignee: Sushanth Sowmyan Labels: datanucleus Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2084.D2397.1.patch, HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D5685.1.patch, HIVE-2084.patch It seems datanucleus 2.2.3 does a better job in caching: fetching the same set of partition objects a second time takes about 1/4 of the time of the first fetch, while with 2.0.3 the second execution took almost the same amount of time. We should retest the test case mentioned in HIVE-1853, HIVE-1862. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-2473) Hive throws an NPE when $HADOOP_HOME points to a tarball install directory that contains a build/ subdirectory.
[ https://issues.apache.org/jira/browse/HIVE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-2473. Resolution: Fixed Fix Version/s: 0.12.0 This should have been fixed via HIVE-3632 If you can still reproduce, feel free to reopen. Hive throws an NPE when $HADOOP_HOME points to a tarball install directory that contains a build/ subdirectory. --- Key: HIVE-2473 URL: https://issues.apache.org/jira/browse/HIVE-2473 Project: Hive Issue Type: Bug Environment: hadoop-0.20.204.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.12.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-2015) Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages
[ https://issues.apache.org/jira/browse/HIVE-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-2015. Resolution: Fixed Fix Version/s: 0.12.0 This should have been fixed via HIVE-3632 Feel free to reopen if you can still reproduce. Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages Key: HIVE-2015 URL: https://issues.apache.org/jira/browse/HIVE-2015 Project: Hive Issue Type: Bug Components: Diagnosability, Metastore Reporter: Carl Steinbach Assignee: Zhenxiao Luo Fix For: 0.12.0 Every time I start up the Hive CLI with logging enabled I'm treated to the following ERROR log messages courtesy of DataNucleus: {code} DEBUG metastore.ObjectStore: datanucleus.plugin.pluginRegistryBundleCheck = LOG ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.core.resources but it cannot be resolved. ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.core.runtime but it cannot be resolved. ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.text but it cannot be resolved. {code} Here's where this comes from: * The bin/hive scripts cause Hive to inherit Hadoop's classpath. * Hadoop's classpath includes $HADOOP_HOME/lib/core-3.1.1.jar, an Eclipse library. * core-3.1.1.jar includes a plugin.xml file defining an OSGI plugin * At startup, Datanucleus scans the classpath looking for OSGI plugins, and will attempt to initialize any that it finds, including the Eclipse OSGI plugins located in core-3.1.1.jar * Initialization of the OSGI plugin in core-3.1.1.jar fails because of unresolved dependencies. * We see an ERROR message telling us that Datanucleus failed to initialize a plugin that we don't care about in the first place. I can think of two options for solving this problem: # Rewrite the scripts in $HIVE_HOME/bin so that they don't inherit ALL of Hadoop's CLASSPATH. 
# Replace DataNucleus's NonManagedPluginRegistry with our own implementation that does nothing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
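A third, narrower workaround (my assumption, not proposed in the ticket) is to leave the classpath and DataNucleus alone and just raise the logging threshold for the offending logger using log4j's standard per-logger level syntax in hive-log4j.properties:

```properties
# Assumed workaround: suppress only the DataNucleus.Plugin logger so the
# unresolved org.eclipse.* bundle ERRORs are no longer printed at startup.
log4j.logger.DataNucleus.Plugin=FATAL
```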
[jira] [Updated] (HIVE-4878) With Dynamic partitioning, some queries would scan default partition even if query is not using it.
[ https://issues.apache.org/jira/browse/HIVE-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4878: --- Resolution: Fixed Fix Version/s: (was: 0.11.1) 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, John! With Dynamic partitioning, some queries would scan default partition even if query is not using it. --- Key: HIVE-4878 URL: https://issues.apache.org/jira/browse/HIVE-4878 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.0 Attachments: HIVE-4878.patch With Dynamic partitioning, Hive would scan default partitions in some cases even if query excludes it. As part of partition pruning, predicate is narrowed down to those pieces that involve partition columns only. This predicate is then evaluated with partition values to determine, if scan should include those partitions. But in some cases (like when comparing __HIVE_DEFAULT_PARTITION__ to numeric data types) expression evaluation would fail and would return NULL instead of true/false. In such cases the partition is added to unknown partitions which is then subsequently scanned. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
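The pruning failure described above comes down to three-valued logic: comparing the non-numeric __HIVE_DEFAULT_PARTITION__ marker with a numeric literal evaluates to NULL rather than true/false, and a NULL verdict lands the partition in the "unknown" set that gets scanned. A minimal sketch with hypothetical helper names (not Hive's actual pruner code):

```java
// Illustrative sketch of why the default partition survives pruning for a
// predicate like "partcol > 100": evaluation against the default-partition
// marker fails, returns null (unknown), and unknown partitions are scanned.
import java.util.Arrays;
import java.util.List;

public class PrunerSketch {
    static final String DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__";

    /** Three-valued evaluation of "value > 100": TRUE, FALSE, or null (unknown). */
    static Boolean evalGreaterThan100(String partitionValue) {
        try {
            return Integer.parseInt(partitionValue) > 100;
        } catch (NumberFormatException e) {
            return null; // expression evaluation fails -> unknown, not false
        }
    }

    public static void main(String[] args) {
        for (String p : Arrays.asList("50", "200", DEFAULT_PARTITION)) {
            Boolean verdict = evalGreaterThan100(p);
            // A null verdict means the partition joins the unknown set and is scanned.
            boolean scanned = verdict == null || verdict;
            System.out.println(p + " scanned=" + scanned);
        }
    }
}
```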
[jira] [Updated] (HIVE-4927) When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
[ https://issues.apache.org/jira/browse/HIVE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4927: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Yin! When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed Key: HIVE-4927 URL: https://issues.apache.org/jira/browse/HIVE-4927 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Fix For: 0.12.0 Attachments: HIVE-4927.D11811.1.patch, HIVE-4927.D11811.2.patch, HIVE-4927.D11811.3.patch, HIVE-4927.D11811.3.patch {code} set hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask=true; EXPLAIN SELECT x1.key AS key FROM src x1 JOIN src1 y1 ON (x1.key = y1.key) JOIN src1 y2 ON (x1.value = y2.value) GROUP BY x1.key; {code} We will get an NPE from MetadataOnlyOptimizer. The reason is that the operator tree of the MapRedTask evaluating two MapJoins is {code} TS1-MapJoin1-TS2-MapJoin2-... {code} We should remove the TS2... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
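The shape of the fix can be sketched as follows (hypothetical names and a string-based operator chain for illustration, not the actual GenMapRedUtils/CommonJoinTaskDispatcher code): when the second map-only task's chain is appended to the first, its leading TableScanOperator must be skipped, because its input is now MapJoin1 rather than a table scan:

```java
// Simplified sketch: merging two map-only MapJoin operator chains while
// dropping the second chain's leading TableScan, so the merged tree is
// TS1 -> MapJoin1 -> MapJoin2 instead of TS1 -> MapJoin1 -> TS2 -> MapJoin2.
import java.util.ArrayList;
import java.util.List;

public class MergeSketch {
    static List<String> merge(List<String> first, List<String> second) {
        List<String> merged = new ArrayList<>(first);
        int start = 0;
        // Skip the interior TableScan; in real code this would check the
        // operator's class, not a name prefix.
        if (!second.isEmpty() && second.get(0).startsWith("TS")) {
            start = 1;
        }
        merged.addAll(second.subList(start, second.size()));
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of("TS1", "MapJoin1"),
                                 List.of("TS2", "MapJoin2")));
        // prints [TS1, MapJoin1, MapJoin2]
    }
}
```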
[jira] [Created] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
Yin Huai created HIVE-4942: -- Summary: Fix eclipse template files to use correct datanucleus libs Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
[ https://issues.apache.org/jira/browse/HIVE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720915#comment-13720915 ] Brock Noland commented on HIVE-4920: I think we should also be able to block the test until prices come back down and then allocate new hosts during the middle of the test. PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Attachments: Screen Shot 2013-07-23 at 3.35.00 PM.png Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720916#comment-13720916 ] Yin Huai commented on HIVE-4942: We need HIVE-2739 Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942.1.patch Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.1.patch HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: (was: HIVE-4942.patch) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942.patch Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: (was: HIVE-4942.1.patch) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: (was: HIVE-4942) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942 Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942.txt Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Status: Patch Available (was: Open) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
[ https://issues.apache.org/jira/browse/HIVE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720922#comment-13720922 ] Edward Capriolo commented on HIVE-4920: --- Are you trying to build a high frequency trading system or a test suite :) JK. We should be optimal but lets not spend too much time gaming the system. I think we are better off following up with the discussion on list of chopping up the project. PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Attachments: Screen Shot 2013-07-23 at 3.35.00 PM.png Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4943) An explode function that includes the item's position in the array
Niko Stahl created HIVE-4943: Summary: An explode function that includes the item's position in the array Key: HIVE-4943 URL: https://issues.apache.org/jira/browse/HIVE-4943 Project: Hive Issue Type: New Feature Reporter: Niko Stahl A function that explodes an array and includes an output column with the position of each item in the original array. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4512) The vectorized plan is not picking right expression class for string concatenation.
[ https://issues.apache.org/jira/browse/HIVE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720945#comment-13720945 ] Hive QA commented on HIVE-4512: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594218/HIVE-4512.2-vectorization.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/193/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/193/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests failed with: IllegalStateException: Too many bad hosts: 0.6% (6 / 10) is greater than threshold of 50% {noformat} This message is automatically generated. The vectorized plan is not picking right expression class for string concatenation. --- Key: HIVE-4512 URL: https://issues.apache.org/jira/browse/HIVE-4512 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Jitendra Nath Pandey Assignee: Eric Hanson Attachments: HIVE-4512.1-vectorization.patch, HIVE-4512.2-vectorization.patch The vectorized plan is not picking right expression class for string concatenation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Extending Explode
FYI, Brickhouse provides a numeric_range UDTF, which explodes integer values, and an array_index UDF, so you could solve your problem by exploding on a numeric range of the size of the array, i.e. select n, array_index(arr, n ) from mytable lateral view numeric_range(0, size(arr) -1 ) n1 as n ; -- jerome On Fri, Jul 26, 2013 at 6:40 AM, Edward Capriolo edlinuxg...@gmail.com wrote: So the explode output should be in the order of the array or the map you are exploding. I think you could use rank or row-sequence to give the exploded array a number. If that does not work, adding a new UDF might make more sense than extending in this case. On Friday, July 26, 2013, nikolaus.st...@researchgate.net wrote: Hi, I'd like to make a patch that extends the functionality of explode to include an output column with the position of each item in the original array. I imagine this could be useful to the greater community and am wondering if I should extend the current explode function or if I should write a completely new function. Any thoughts on what will be more useful and more likely to be added to the hive-trunk would be greatly appreciated. Thanks, Niko
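The proposed semantics, one (position, value) row per array element, can be sketched in plain Java (illustrative only, not a real GenericUDTF implementation; later Hive releases ship this behavior as the built-in posexplode UDTF):

```java
// Sketch of position-aware explode: emit each array element as a
// (pos, val) row, preserving the original array order.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PosExplodeSketch {
    static List<Object[]> posExplode(List<String> arr) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < arr.size(); i++) {
            rows.add(new Object[] { i, arr.get(i) }); // one (pos, val) row per element
        }
        return rows;
    }

    public static void main(String[] args) {
        for (Object[] row : posExplode(Arrays.asList("a", "b", "c"))) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```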
[jira] [Commented] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720968#comment-13720968 ] Ashutosh Chauhan commented on HIVE-4942: +1 Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
+1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to the rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as a set of libraries that let pig and java MR use hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in the same project. On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also I believe hcatalog web can fall into the same designation. Question: hcatalog was initially a big hive-metastore fork. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in. On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well. On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing on a dual-core 2 GB RAM laptop for years now. With the addition of hcatalog, hive-thrift2, and some other growth, trying to develop hive in Eclipse on this machine crawls, especially if 'build automatically' is turned on. As we look to add on more things this is only going to get worse. I am also noticing issues like this: https://issues.apache.org/jira/browse/HIVE-4849 What I think we should do is strip down/out optional parts of hive. 1) Hive Hbase This should really be its own project; to do this right we really have to have multiple branches since hbase is not backwards compatible. 
2) Hive Web Interface Not really a big project, but not really critical; can just as easily be built separately 3) hive thrift 1 We have hive thrift 2 now, it is time for the sun to set on hive-thrift1. 4) odbc Not entirely convinced about this one but it is really not critical to running hive. What I think we should do is create sub-projects for the above things or simply move them into directories that do not build with hive. Ideally they would use maven to pull dependencies. What does everyone think? I agree that projects like the HBase handler and probably others as well should somehow be downstream projects which simply depend on the hive jars. I see a couple alternatives for this: * Take the module in question to the Apache Incubator * Move the module in question to the Apache Extras * Break up the projects within our own source tree I'd prefer the third option at this point. Brock
[jira] [Updated] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-4929: --- Attachment: HIVE-4929.patch the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
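The intended typing logic can be sketched with a hypothetical helper (not Hive's actual TypeCheckProcFactory code): the literal gets the narrowest type its suffix or value allows, and is widened only when the column it is compared against is genuinely numeric. The fix keeps the literal's own type when the column type is not numeric, e.g. when it erroneously reads as string due to the HIVE-3059 hack; the buggy path instead fell through to double in that case:

```java
// Illustrative sketch of numeric-literal typing. Suffix "L" marks a bigint
// literal (a Hive literal convention); otherwise the narrowest fitting type
// is chosen, then optionally widened to match a truly numeric column type.
public class LiteralTypeSketch {
    static String literalType(String token) {
        if (token.endsWith("L")) return "bigint";   // explicit suffix wins
        if (token.contains(".")) return "double";
        try {
            Integer.parseInt(token);
            return "int";                            // fits in 32 bits
        } catch (NumberFormatException e) {
            return "bigint";
        }
    }

    /** Widen the literal only when the column is genuinely numeric. */
    static String adjustToColumn(String literalType, String columnType) {
        switch (columnType) {
            case "int": case "bigint": case "double":
                return columnType;
            default:
                return literalType; // column reads as "string": keep literal type
        }
    }

    public static void main(String[] args) {
        System.out.println(adjustToColumn(literalType("100"), "bigint")); // bigint
        System.out.println(adjustToColumn(literalType("100"), "string")); // int
    }
}
```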
[jira] [Updated] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-4929: --- Status: Patch Available (was: Open) the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720998#comment-13720998 ] Sergey Shelukhin commented on HIVE-4929: Review uploaded to RB (Phabricator errors out, seemingly due to size) the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721006#comment-13721006 ] Ashutosh Chauhan commented on HIVE-4929: [~sershe] Can you post the RB link here ? the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721029#comment-13721029 ] Sergey Shelukhin commented on HIVE-4929: https://reviews.apache.org/r/12974/ the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721052#comment-13721052 ] Ashutosh Chauhan commented on HIVE-4929: +1 the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12795: [HIVE-4827] Merge a Map-only job to its following MapReduce job with multiple inputs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12795/ --- (Updated July 26, 2013, 6:50 p.m.) Review request for hive. Changes --- update test results Bugs: HIVE-4827 https://issues.apache.org/jira/browse/HIVE-4827 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-4827 Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8f1af64 conf/hive-default.xml.template 69b85dc eclipse-templates/.classpath 44e6c62 eclipse-templates/.classpath._hbase 397918d ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 66b84ff ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java b5a9291 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java 5340d3c ql/src/test/queries/clientpositive/auto_join33.q 5c85842 ql/src/test/queries/clientpositive/correlationoptimizer1.q 2adf855 ql/src/test/queries/clientpositive/correlationoptimizer3.q fcbb764 ql/src/test/queries/clientpositive/correlationoptimizer4.q 0e84cb7 ql/src/test/queries/clientpositive/correlationoptimizer5.q 1900f5d ql/src/test/queries/clientpositive/correlationoptimizer6.q 88d790c ql/src/test/queries/clientpositive/correlationoptimizer7.q 9b18972 ql/src/test/queries/clientpositive/multiMapJoin1.q 86b0586 ql/src/test/queries/clientpositive/multiMapJoin2.q PRE-CREATION ql/src/test/queries/clientpositive/union34.q a88e395 ql/src/test/results/clientpositive/auto_join0.q.out 652cb76 ql/src/test/results/clientpositive/auto_join10.q.out deb8eb5 ql/src/test/results/clientpositive/auto_join11.q.out 939f512 ql/src/test/results/clientpositive/auto_join12.q.out 23ed0fc ql/src/test/results/clientpositive/auto_join13.q.out 7e0f41d ql/src/test/results/clientpositive/auto_join15.q.out aa40cff ql/src/test/results/clientpositive/auto_join16.q.out e8f1435 ql/src/test/results/clientpositive/auto_join2.q.out a11f347 ql/src/test/results/clientpositive/auto_join20.q.out 13722ec 
ql/src/test/results/clientpositive/auto_join21.q.out 79693fe ql/src/test/results/clientpositive/auto_join22.q.out 6f418db ql/src/test/results/clientpositive/auto_join23.q.out 2755ee1 ql/src/test/results/clientpositive/auto_join24.q.out c7e872e ql/src/test/results/clientpositive/auto_join26.q.out 7268755 ql/src/test/results/clientpositive/auto_join28.q.out 407303c ql/src/test/results/clientpositive/auto_join29.q.out dec2187 ql/src/test/results/clientpositive/auto_join32.q.out 312664a ql/src/test/results/clientpositive/auto_join33.q.out 8fc0e84 ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out da375f6 ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out 42e25fa ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 2ec3cf3 ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out 6add99a ql/src/test/results/clientpositive/correlationoptimizer1.q.out db3bd78 ql/src/test/results/clientpositive/correlationoptimizer3.q.out cfa7eff ql/src/test/results/clientpositive/correlationoptimizer4.q.out 285a54f ql/src/test/results/clientpositive/correlationoptimizer6.q.out b0438e6 ql/src/test/results/clientpositive/correlationoptimizer7.q.out f8db2bf ql/src/test/results/clientpositive/join28.q.out 60165e2 ql/src/test/results/clientpositive/join32.q.out af37f54 ql/src/test/results/clientpositive/join33.q.out af37f54 ql/src/test/results/clientpositive/join_star.q.out 797b892 ql/src/test/results/clientpositive/mapjoin_filter_on_outerjoin.q.out ca21c6c ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out 2f5f613 ql/src/test/results/clientpositive/mapjoin_subquery.q.out 8243c2c ql/src/test/results/clientpositive/mapjoin_subquery2.q.out 292abe4 ql/src/test/results/clientpositive/mapjoin_test_outer.q.out 37817d9 ql/src/test/results/clientpositive/multiMapJoin1.q.out a3f5c53 ql/src/test/results/clientpositive/multiMapJoin2.q.out PRE-CREATION ql/src/test/results/clientpositive/multi_join_union.q.out 5182bdf 
ql/src/test/results/clientpositive/union34.q.out 166062a Diff: https://reviews.apache.org/r/12795/diff/ Testing --- Running tests. Thanks, Yin Huai
[jira] [Updated] (HIVE-4827) Merge a Map-only job to its following MapReduce job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4827: --- Status: Patch Available (was: Open) Merge a Map-only job to its following MapReduce job with multiple inputs Key: HIVE-4827 URL: https://issues.apache.org/jira/browse/HIVE-4827 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Attachments: HIVE-4827.1.patch, HIVE-4827.2.patch, HIVE-4827.3.patch, HIVE-4827.4.patch When hive.optimize.mapjoin.mapreduce is on, CommonJoinResolver can attach a Map-only job (MapJoin) to its following MapReduce job. But this merge only happens when the MapReduce job has a single input. With Correlation Optimizer (HIVE-2206), it is possible that the MapReduce job can have multiple inputs (for multiple operation paths). It is desired to improve CommonJoinResolver to merge a Map-only job to the corresponding Map task of the MapReduce job. Example: {code:sql} set hive.optimize.correlation=true; set hive.auto.convert.join=true; set hive.optimize.mapjoin.mapreduce=true; SELECT tmp1.key, count(*) FROM (SELECT x1.key1 AS key FROM bigTable1 x1 JOIN smallTable1 y1 ON (x1.key1 = y1.key1) GROUP BY x1.key1) tmp1 JOIN (SELECT x2.key2 AS key FROM bigTable2 x2 JOIN smallTable2 y2 ON (x2.key2 = y2.key2) GROUP BY x2.key2) tmp2 ON (tmp1.key = tmp2.key) GROUP BY tmp1.key; {code} In this query, join operations inside tmp1 and tmp2 will be converted to two MapJoins. With Correlation Optimizer, aggregations in tmp1, tmp2, and join of tmp1 and tmp2, and the last aggregation will be executed in the same MapReduce job (Reduce side). Since this MapReduce job has two inputs, right now, CommonJoinResolver cannot attach two MapJoins to the Map side of a MapReduce job.
Another example: {code:sql} SELECT tmp1.key FROM (SELECT x1.key2 AS key FROM bigTable1 x1 JOIN smallTable1 y1 ON (x1.key1 = y1.key1) UNION ALL SELECT x2.key2 AS key FROM bigTable2 x2 JOIN smallTable2 y2 ON (x2.key1 = y2.key1)) tmp1 {code} For this case, we will have three Map-only jobs (two for MapJoins and one for Union). It will be good to use a single Map-only job to execute this query. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4827) Merge a Map-only job to its following MapReduce job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4827: --- Attachment: HIVE-4827.4.patch
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721060#comment-13721060 ] Phabricator commented on HIVE-3926: --- hagleitn has accepted the revision HIVE-3926 [jira] PPD on virtual column of partitioned table is not working. LGTM REVISION DETAIL https://reviews.facebook.net/D8121 BRANCH HIVE-3926 ARCANIST PROJECT hive To: JIRA, hagleitn, navis Cc: hagleitn PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch, HIVE-3926.D8121.5.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE > 100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE > 100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4827) Merge a Map-only job to its following MapReduce job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721062#comment-13721062 ] Yin Huai commented on HIVE-4827: Updated patch has been uploaded to RB. Mark it as PA to trigger precommit tests.
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721070#comment-13721070 ] Gunther Hagleitner commented on HIVE-4470: -- Admin flag sounds good to me too. {quote} By performance penalty, do you mean the increased latency because of MR job launching? {quote} It's much worse than that. There's no way right now to run the local stage of a map join anywhere but on the client machine, which is the HS2 machine in this case. So, you could either disable map joins altogether for HS2 through the admin flag (which means really expensive shuffle joins for everything), or do the work to be able to run the hash table gen in the cluster, which makes this ticket really huge. HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the hive server2 node if you have heavy query processing happening on it. Bugs in udfs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if the user tries to set them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
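The mitigation discussed in this issue would amount to pinning the relevant setting on the HS2 host. A minimal hive-site.xml sketch: hive.exec.mode.local.auto is Hive's real configuration property, but applying it this way is an assumption about the deployment, not something the ticket prescribes (and, as noted in a later comment, users can still override HiveConf unless the property is restricted).

```xml
<!-- Sketch: turn off automatic local-mode execution for queries
     submitted through this HiveServer2 instance. -->
<property>
  <name>hive.exec.mode.local.auto</name>
  <value>false</value>
</property>
```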
[jira] [Commented] (HIVE-4343) HS2 with kerberos- local task for map join fails
[ https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721075#comment-13721075 ] Gunther Hagleitner commented on HIVE-4343: -- I think we should move forward with this and consider HIVE-4470 as orthogonal. People might still want to run local work on the HS2. I agree that this is potentially dangerous and probably not a good default, but on HIVE-4470 the recommendation is to have an admin flag for on/off. IMO, this ticket should still go in. HS2 with kerberos - local task for map join fails Key: HIVE-4343 URL: https://issues.apache.org/jira/browse/HIVE-4343 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4343.1.patch With hive server2 configured with kerberos security, when a (map) join query is run, it results in failure with GSSException: No valid credentials provided -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4892) PTest2 cleanup after merge
[ https://issues.apache.org/jira/browse/HIVE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721107#comment-13721107 ] Brock Noland commented on HIVE-4892: Hey, This patch contains some deletes and therefore left some empty files when it was applied. We should execute an addendum commit: {noformat} svn rm ./testutils/ptest2/src/test/resources/test-outputs/TEST-SomeTest-truncated.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-skewjoin.q-ab8536a7-1b5c-45ed-ba29-14450f27db8b-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-union_remove_9.q-acb9de8f-1b9c-4874-924c-b2107ca7b07c-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-skewjoin_union_remove_1.q-6fa31776-d2b0-4e13-9761-11f750627ad1-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-index_auth.q-bucketcontex-ba31fb54-1d7f-4c70-a89d-477b7d155191-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/TEST-SomeTest-failure.xml {noformat} PTest2 cleanup after merge -- Key: HIVE-4892 URL: https://issues.apache.org/jira/browse/HIVE-4892 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4892.patch, HIVE-4892.patch HIVE-4675 was merged but there are still a few minor issues we need to clean up: * README is out of date * Need to limit the number of failed source directories we copy back from the slaves * when looking for TEST-*.xml files we look at both the log directory (good) and the failed source directories (bad), therefore duplicating failures in the jenkins report * We need to process bad hosts in the finally block of PTest.run (HIVE-4882) * Need a mechanism to clean the ivy and maven cache (HIVE-4882) * PTest2 fails to publish a comment to a JIRA sometimes (HIVE-4889) * Now that PTest2 is committed to the 
source tree it's copying in our TEST-SomeTest*.xml files Test Properties: NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-305) Port Hadoop streaming's counters/status reporters to Hive Transforms
[ https://issues.apache.org/jira/browse/HIVE-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721110#comment-13721110 ] Brock Noland commented on HIVE-305: --- +1 Port Hadoop streaming's counters/status reporters to Hive Transforms Key: HIVE-305 URL: https://issues.apache.org/jira/browse/HIVE-305 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Venky Iyer Assignee: Guo Hongjie Attachments: HIVE-305.1.patch, HIVE-305.2.patch, hive-305.3.diff.txt, HIVE-305.patch.txt https://issues.apache.org/jira/browse/HADOOP-1328 Introduced a way for a streaming process to update global counters and status using stderr stream to emit information. Use reporter:counter:group,counter,amount to update a counter. Use reporter:status:message to update status. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
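The stderr protocol described in this issue is easy to emit from any transform script. A minimal Python sketch, assuming the Hadoop-streaming-style line formats quoted above (the helper names are illustrative):

```python
import sys

def emit_counter(group, counter, amount, out=sys.stderr):
    # Matches the documented form: reporter:counter:group,counter,amount
    out.write(f"reporter:counter:{group},{counter},{amount}\n")

def emit_status(message, out=sys.stderr):
    # Matches the documented form: reporter:status:message
    out.write(f"reporter:status:{message}\n")

# A transform script would call these while streaming rows, e.g.:
# emit_counter("MyTransform", "bad_records", 1)
# emit_status("processing input")
```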
[jira] [Commented] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721104#comment-13721104 ] Hive QA commented on HIVE-2137: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12501499/HIVE-2137.patch {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/194/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/194/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver for HiveServer1 decodes strings with the client-side default encoding, which depends on the operating system unless another encoding is specified. It ignores the server-side encoding. For example, when the server-side operating system and encoding are Linux (UTF-8) and the client-side operating system and encoding are Windows (Shift-JIS, a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
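The corruption mode described in this issue is easy to reproduce outside Hive. A small Python demonstration (the sample text is illustrative):

```python
# Server side writes UTF-8 bytes; a client that decodes with its OS default
# (e.g. Shift-JIS on Japanese Windows) gets mojibake.
text = "データ"                      # "data" in Japanese
wire = text.encode("utf-8")          # what the server actually sends
garbled = wire.decode("shift_jis", errors="replace")  # client's wrong default
assert garbled != text               # character corruption in the client
fixed = wire.decode("utf-8")         # the fix: always decode as UTF-8
assert fixed == text
```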
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721117#comment-13721117 ] Edward Capriolo commented on HIVE-4470: --- Adding an option is nice, but I do not see how it is enforceable since HiveConf can be changed by the user.
[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721125#comment-13721125 ] Jeff Wu commented on HIVE-1545: --- Trying to compile these and load the jar, but the package com.facebook.hive.udf.tests isn't included. Can someone attach that? Add a bunch of UDFs and UDAFs - Key: HIVE-1545 URL: https://issues.apache.org/jira/browse/HIVE-1545 Project: Hive Issue Type: New Feature Components: UDF Reporter: Jonathan Chang Assignee: Jonathan Chang Priority: Minor Attachments: core.tar.gz, ext.tar.gz, UDFEndsWith.java, UDFFindInString.java, UDFLtrim.java, UDFRtrim.java, udfs.tar.gz, udfs.tar.gz, UDFStartsWith.java, UDFTrim.java Here are some UD(A)Fs which can be incorporated into the Hive distribution: UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 5, 3) returns 1. UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...) will return the smallest i such that x > b_{i} but <= b_{i+1}. Returns 0 if x is smaller than all the buckets. UDFFindInArray - Finds the 1-index of the first element in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, array(1,2,3)) will return 0. UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees). UDFLDA - Performs LDA inference on a vector given fixed topics. UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 whenever any of its parameters changes. UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 5. UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array. UDFUnescape - Returns the string unescaped (using C/Java style unescaping). UDFWhich - Given a boolean array, return the indices which are TRUE. 
UDFJaccard UDAFCollect - Takes all the values associated with a row and converts it into a list. Make sure to have: set hive.map.aggr = false; UDAFCollectMap - Like collect except that it takes tuples and generates a map. UDAFEntropy - Compute the entropy of a column. UDAFPearson (BROKEN!!!) - Computes the pearson correlation between two columns. UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL. UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL. UDAFHistogram -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
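As an illustration of the semantics documented above, here is a hypothetical Python model of FIND_IN_ARRAY, with None standing in for SQL NULL; this sketches the described behavior, not the actual Java implementation:

```python
def find_in_array(needle, arr):
    # 1-based index of the first match; 0 if absent; NULL-propagating.
    if needle is None or arr is None:
        return None
    for i, value in enumerate(arr, start=1):
        if value == needle:
            return i
    return 0

find_in_array(5, [1, 2, 5])  # → 3, as in the example above
find_in_array(5, [1, 2, 3])  # → 0
```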
[jira] [Updated] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4928: - Status: Patch Available (was: Open) Date literals do not work properly in partition spec clause --- Key: HIVE-4928 URL: https://issues.apache.org/jira/browse/HIVE-4928 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4928.1.patch.txt The partition spec parsing doesn't do any actual evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4928: - Attachment: HIVE-4928.1.patch.txt Patch changes the parsing of the date literal so the DATELITERAL contains the date string value. This makes it more consistent with the ASTNodes generated for the other type literals.
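The bug and fix described for HIVE-4928 can be illustrated with a toy AST. The node class below is hypothetical; only the token names come from the issue text:

```python
class ASTNode:
    def __init__(self, text, children=()):
        self.text = text            # token name or literal text
        self.children = list(children)

def partition_value(node):
    # Before the patch, the token name itself ("TOK_DATELITERAL") was taken
    # as the partition value. The described fix is to use the literal's
    # string value instead.
    if node.text == "TOK_DATELITERAL":
        return node.children[0].text.strip("'")
    return node.text

date_lit = ASTNode("TOK_DATELITERAL", [ASTNode("'1999-12-31'")])
partition_value(date_lit)       # "1999-12-31" rather than "TOK_DATELITERAL"
partition_value(ASTNode("99"))  # string/numeric literals already worked
```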
[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721143#comment-13721143 ] Jonathan Chang commented on HIVE-1545: -- I think they should be migrated to use the equivalent facilities in the PDK?
[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721144#comment-13721144 ] Jonathan Chang commented on HIVE-1545: -- For the time being you can remove those packages (and the corresponding annotations) without affecting the functionality. Add a bunch of UDFs and UDAFs - Key: HIVE-1545 URL: https://issues.apache.org/jira/browse/HIVE-1545 Project: Hive Issue Type: New Feature Components: UDF Reporter: Jonathan Chang Assignee: Jonathan Chang Priority: Minor Attachments: core.tar.gz, ext.tar.gz, UDFEndsWith.java, UDFFindInString.java, UDFLtrim.java, UDFRtrim.java, udfs.tar.gz, udfs.tar.gz, UDFStartsWith.java, UDFTrim.java Here some UD(A)Fs which can be incorporated into the Hive distribution: UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 5, 3) returns 1. UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...), will return the smallest i such that x b_{i} but = b_{i+1}. Returns 0 if x is smaller than all the buckets. UDFFindInArray - Finds the 1-index of the first element in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, array(1,2,3)) will return 0. UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees). UDFLDA - Performs LDA inference on a vector given fixed topics. UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 whenever any of its parameters changes. UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 5. UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array. UDFUnescape - Returns the string unescaped (using C/Java style unescaping). UDFWhich - Given a boolean array, return the indices which are TRUE. 
UDFJaccard UDAFCollect - Takes all the values associated with a row and converts it into a list. Make sure to have: set hive.map.aggr = false; UDAFCollectMap - Like collect except that it takes tuples and generates a map. UDAFEntropy - Compute the entropy of a column. UDAFPearson (BROKEN!!!) - Computes the pearson correlation between two columns. UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL. UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL. UDAFHistogram -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
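The intended semantics of a few of the scalar UDFs above are easy to sketch in plain Java. The helper names below are hypothetical, independent of the actual UDF classes attached to the ticket; only the described behavior is taken from the list:

```java
// Sketch of the described semantics for ARGMAX, BUCKET, and FIND_IN_ARRAY.
// Method names are hypothetical, not the Hive UDF classes from HIVE-1545.
public class UdfSemantics {
    // ARGMAX(4, 5, 3) -> 1: the 0-indexed position of the largest argument.
    public static int argMax(double... args) {
        int best = 0;
        for (int i = 1; i < args.length; i++) {
            if (args[i] > args[best]) best = i;
        }
        return best;
    }

    // BUCKET(x, b_1, ..., b_n): smallest i with x > b_i but x <= b_{i+1};
    // returns 0 when x is smaller than all bucket boundaries.
    public static int bucket(double x, double... bounds) {
        int i = 0;
        while (i < bounds.length && x > bounds[i]) i++;
        return i;
    }

    // FIND_IN_ARRAY(5, [1,2,5]) -> 3: 1-indexed position; 0 when absent.
    public static int findInArray(int needle, int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == needle) return i + 1;
        }
        return 0;
    }
}
```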
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4388: --- Attachment: HIVE-4388-wip.txt Attached is a WIP patch not meant for commit. 450KB of the 550KB is the generated protocol buffers class required for the co-processor in HCat. I verified TestHBaseCliDriver with both hadoop1 and hadoop2, therefore I think we are in a decent spot with regards to 0.96 compatibility. Note that for the hadoop2 build I had to hack together the classpath, as the upstream hadoop2 snapshot is slightly out of date. From here I need to clean the patch up quite a bit (I'm not an ant/ivy expert, so I was hacking away a little) and then do a lot more testing. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721167#comment-13721167 ] Gunther Hagleitner commented on HIVE-4470: -- [~appodictic] Can you explain what you mean by that some more? You mean an admin can set defaults, but we can't make sure someone submitting a query doesn't overwrite it? HiveConf only exists on the server in this case, as does the rest of the planning/submission code. Why wouldn't we be able to limit the user in what they can do? HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721171#comment-13721171 ] Gunther Hagleitner commented on HIVE-3926: -- +1 This looks good. Planning to commit tomorrow. PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch, HIVE-3926.D8121.5.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE < 100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE < 100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721179#comment-13721179 ] Edward Capriolo commented on HIVE-4470: --- In HiveThrift1 I can do: {code} client.execute("SET hive.security.authorization.enabled=false"); client.execute("SELECT * FROM StuffIamNotSupposedtoSee"); {code} Is there some mechanism in hive thrift2 that prevents set commands? HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2702) listPartitionsByFilter only supports string partitions for equals
[ https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2702: -- Attachment: HIVE-2702.D11847.1.patch sershe requested code review of HIVE-2702 [jira] listPartitionsByFilter only supports string partitions for equals. Reviewers: JIRA Rebase on top of HIVE-4929. It should still compile/pass server tests, but won't work properly before HIVE-4929 listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D11847 AFFECTED FILES metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? 
https://reviews.facebook.net/herald/transcript/28167/ To: JIRA, sershe listPartitionsByFilter only supports string partitions for equals - Key: HIVE-2702 URL: https://issues.apache.org/jira/browse/HIVE-2702 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Aniket Mokashi Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, HIVE-2702.1.patch, HIVE-2702.D11715.1.patch, HIVE-2702.D11715.2.patch, HIVE-2702.D11715.3.patch, HIVE-2702.D11847.1.patch, HIVE-2702.patch, HIVE-2702-v0.patch listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
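The direction of the fix is to let integral partition-key types through the filter in addition to strings. A simplified stand-in for the ExpressionTree check could look like the following; the exact set of accepted type names is an assumption, not taken from the committed patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified sketch: which partition-key types a JDO filter could accept
// once integral types are supported alongside strings. The type-name set
// here is illustrative, not the committed HIVE-2702 logic.
public class PartitionFilterTypes {
    private static final Set<String> FILTERABLE =
        new HashSet<>(Arrays.asList("string", "tinyint", "smallint", "int", "bigint"));

    public static boolean isFilterable(String colType) {
        return FILTERABLE.contains(colType);
    }

    public static void checkFilterable(String colType) throws Exception {
        if (!isFilterable(colType)) {
            throw new Exception(
                "Filtering is supported only on partition keys of string or integral types");
        }
    }
}
```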
[jira] [Updated] (HIVE-4586) [HCatalog] WebHCat should return 404 error for undefined resource
[ https://issues.apache.org/jira/browse/HIVE-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4586: - Status: Open (was: Patch Available) Applying the patch results in a significant number of checkstyle failures. [HCatalog] WebHCat should return 404 error for undefined resource - Key: HIVE-4586 URL: https://issues.apache.org/jira/browse/HIVE-4586 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4586-1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721192#comment-13721192 ] Gunther Hagleitner commented on HIVE-4470: -- That is an awesome attack. :-) I thought there's already some blacklist for certain vars in HiveConf for this case. I'm hoping security enabled/disabled is in that list. HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2702) listPartitionsByFilter only supports string partitions for equals
[ https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721198#comment-13721198 ] Ashutosh Chauhan commented on HIVE-2702: Do you want to improve the description of the ticket ... something like "Enhance listPartitionsByFilter to add support for integral types both for equality and non-equality" listPartitionsByFilter only supports string partitions for equals - Key: HIVE-2702 URL: https://issues.apache.org/jira/browse/HIVE-2702 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Aniket Mokashi Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, HIVE-2702.1.patch, HIVE-2702.D11715.1.patch, HIVE-2702.D11715.2.patch, HIVE-2702.D11715.3.patch, HIVE-2702.D11847.1.patch, HIVE-2702.patch, HIVE-2702-v0.patch listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4055: - Attachment: HIVE-4055.4.patch Re-upload HIVE-4055.4.patch (without .txt suffix), to get automated tests to run. add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Assignee: Jason Dere Attachments: Date.pdf, HIVE-4055.1.patch.txt, HIVE-4055.2.patch.txt, HIVE-4055.3.patch.txt, HIVE-4055.4.patch, HIVE-4055.4.patch.txt, HIVE-4055.D11547.1.patch Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721208#comment-13721208 ] Thejas M Nair commented on HIVE-4470: - bq. I thought there's already some black list for certain vars in HiveConf for this case. I'm hoping security enabled/disabled is in that list. Yes, you can configure that using hive.conf.restricted.list . But it is empty by default. [~appodictic] That is something that needs to go in the default restricted list! HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
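In the meantime, an administrator could pre-seed the restricted list in hive-site.xml so that neither the authorization switch nor the local-execution toggle (nor the list itself) can be overridden per-session with SET. The exact value below is illustrative:

```xml
<!-- Illustrative hive-site.xml fragment: keys named in
     hive.conf.restricted.list cannot be changed at runtime with SET.
     Note the list also protects itself from being overridden. -->
<property>
  <name>hive.conf.restricted.list</name>
  <value>hive.security.authorization.enabled,hive.exec.mode.local.auto,hive.conf.restricted.list</value>
</property>
```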
[jira] [Commented] (HIVE-2702) listPartitionsByFilter only supports string partitions for equals
[ https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721209#comment-13721209 ] Phabricator commented on HIVE-2702: --- ashutoshc has accepted the revision HIVE-2702 [jira] listPartitionsByFilter only supports string partitions for equals. +1 Looks good. A few minor nits. INLINE COMMENTS metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g:160 Do you want to name this IntegralLiteral now? metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:312 Do you instead want to say in this TODO that this will be dealt with in HIVE-4888 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:2188 Do you want to say for the 2nd TODO that this will be dealt with in HIVE-4888 REVISION DETAIL https://reviews.facebook.net/D11847 BRANCH HIVE-2702-2 ARCANIST PROJECT hive To: JIRA, ashutoshc, sershe listPartitionsByFilter only supports string partitions for equals - Key: HIVE-2702 URL: https://issues.apache.org/jira/browse/HIVE-2702 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Aniket Mokashi Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, HIVE-2702.1.patch, HIVE-2702.D11715.1.patch, HIVE-2702.D11715.2.patch, HIVE-2702.D11715.3.patch, HIVE-2702.D11847.1.patch, HIVE-2702.patch, HIVE-2702-v0.patch listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4942: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) It would be really nice if someone picks up HIVE-2739. In the meanwhile thanks Yin for the quick fix. Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4299) exported metadata by HIVE-3068 cannot be imported because of wrong file name
[ https://issues.apache.org/jira/browse/HIVE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721215#comment-13721215 ] Hive QA commented on HIVE-4299: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594323/HIVE-4299.1.patch.txt {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/196/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/196/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. exported metadata by HIVE-3068 cannot be imported because of wrong file name Key: HIVE-4299 URL: https://issues.apache.org/jira/browse/HIVE-4299 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Sho Shimauchi Assignee: Sho Shimauchi Attachments: HIVE-4299.1.patch.txt, HIVE-4299.patch h2. Symptom When DROP TABLE is run on a table, metadata for the table is exported so that the dropped table can be imported again. However, the exported metadata is named 'table name.metadata'. Since ImportSemanticAnalyzer allows only '_metadata' as the metadata filename, the user has to rename the metadata file to import the table. h2. 
How to reproduce Set the following setting in hive-site.xml: {code} <property> <name>hive.metastore.pre.event.listeners</name> <value>org.apache.hadoop.hive.ql.parse.MetaDataExportListener</value> </property> {code} Then run the following queries: {code} CREATE TABLE test_table (id INT, name STRING); DROP TABLE test_table; IMPORT TABLE test_table_imported FROM '/path/to/metadata/file'; FAILED: SemanticException [Error 10027]: Invalid path {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721222#comment-13721222 ] Ashutosh Chauhan commented on HIVE-4825: One more comment. Sorry missed that one earlier. Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch, HIVE-4825.3.testfiles.patch, HIVE-4825.4.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4299) exported metadata by HIVE-3068 cannot be imported because of wrong file name
[ https://issues.apache.org/jira/browse/HIVE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721235#comment-13721235 ] Ashutosh Chauhan commented on HIVE-4299: I guess you have tested it manually on a cluster, so skipping unit tests should be alright. Although, I think it will be better to do: + public static final String METADATA_NAME = ".metadata"; instead of + public static final String METADATA_NAME = "_metadata"; because some folks may have already exported the data, which cannot be imported with your change but can be if we choose the former instead. exported metadata by HIVE-3068 cannot be imported because of wrong file name Key: HIVE-4299 URL: https://issues.apache.org/jira/browse/HIVE-4299 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Sho Shimauchi Assignee: Sho Shimauchi Attachments: HIVE-4299.1.patch.txt, HIVE-4299.patch h2. Symptom When DROP TABLE is run on a table, metadata for the table is exported so that the dropped table can be imported again. However, the exported metadata is named 'table name.metadata'. Since ImportSemanticAnalyzer allows only '_metadata' as the metadata filename, the user has to rename the metadata file to import the table. h2. How to reproduce Set the following setting in hive-site.xml: {code} <property> <name>hive.metastore.pre.event.listeners</name> <value>org.apache.hadoop.hive.ql.parse.MetaDataExportListener</value> </property> {code} Then run the following queries: {code} CREATE TABLE test_table (id INT, name STRING); DROP TABLE test_table; IMPORT TABLE test_table_imported FROM '/path/to/metadata/file'; FAILED: SemanticException [Error 10027]: Invalid path {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
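Ashutosh's suggestion amounts to matching on a suffix rather than an exact filename, so that both pre-existing 'tablename.metadata' exports and a plain '_metadata' file would import cleanly. A hypothetical version of that check (not ImportSemanticAnalyzer's actual code):

```java
// Hypothetical helper illustrating the suggested suffix-based match for
// HIVE-4299; the real ImportSemanticAnalyzer logic may differ.
public class MetadataName {
    public static final String METADATA_SUFFIX = "metadata";

    // Accepts "_metadata" (current importer convention) as well as
    // "tablename.metadata" (what MetaDataExportListener writes).
    public static boolean isMetadataFile(String fileName) {
        return fileName.equals("_" + METADATA_SUFFIX)
            || fileName.endsWith("." + METADATA_SUFFIX);
    }
}
```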
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721246#comment-13721246 ] Gunther Hagleitner commented on HIVE-4470: -- Hehe. Yeah, btw: hive.conf.restricted.list should probably also be in the restricted list. HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4885) Alternative object serialization for execution plan in hive testing
[ https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721250#comment-13721250 ] Ashutosh Chauhan commented on HIVE-4885: [~appodictic] How did your tests go? Did you play with XStream? If we pick a binary (de)serializer, one option is to detect whether we are running in test mode and, if so, serialize using the existing serialization mechanism instead, thus preserving the existing test infra built for doing plan validations. Alternative object serialization for execution plan in hive testing Key: HIVE-4885 URL: https://issues.apache.org/jira/browse/HIVE-4885 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4885.patch Currently there are a lot of test cases that involve comparing execution plans, such as those in the TestParse suite. XmlEncoder is used to serialize the plan generated by Hive and store it in a file for diff comparison. However, XmlEncoder is tied to the Java compiler, whose implementation may change from version to version. Thus, upgrading the compiler can generate a lot of spurious test failures. 
The following is an example of a diff generated when running Hive with JDK7: {code} Begin query: case_sensitivity.q diff -a /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.out /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/parse/case_sensitivity.q.out diff -a -b /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.xml /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/plan/case_sensitivity.q.xml 3c3 < <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask0"> --- > <object id="MapRedTask0" class="org.apache.hadoop.hive.ql.exec.MapRedTask"> 12c12 < <object class="java.util.ArrayList" id="ArrayList0"> --- > <object id="ArrayList0" class="java.util.ArrayList"> 14c14 < <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask0"> --- > <object id="MoveTask0" class="org.apache.hadoop.hive.ql.exec.MoveTask"> 18c18 < <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask1"> --- > <object id="MoveTask1" class="org.apache.hadoop.hive.ql.exec.MoveTask"> 22c22 < <object class="org.apache.hadoop.hive.ql.exec.StatsTask" id="StatsTask0"> --- > <object id="StatsTask0" class="org.apache.hadoop.hive.ql.exec.StatsTask"> 60c60 < <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask1"> --- > <object id="MapRedTask1" class="org.apache.hadoop.hive.ql.exec.MapRedTask"> {code} As can be seen, the only difference is the order of the attributes in the serialized XML doc, yet it brings 50+ test failures in Hive. We need a better plan comparison, or object serialization, to improve the situation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
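Since the diff differs only in attribute order, one workaround would be to compare the serialized plans structurally rather than textually, e.g. by parsing each tag's attributes into a sorted map before comparing. A minimal sketch of the idea (the helper class is hypothetical, not part of Hive's test infrastructure):

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: two open tags with the same element name are considered
// equivalent if their attribute key/value maps match, regardless of the
// order in which the attributes were serialized. (The element-name check
// itself is omitted for brevity.)
public class TagCompare {
    private static final Pattern ATTR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    // Parse attr="value" pairs into a sorted map, discarding order.
    public static Map<String, String> attributes(String tag) {
        Map<String, String> attrs = new TreeMap<>();
        Matcher m = ATTR.matcher(tag);
        while (m.find()) {
            attrs.put(m.group(1), m.group(2));
        }
        return attrs;
    }

    public static boolean sameTag(String a, String b) {
        return attributes(a).equals(attributes(b));
    }
}
```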
[jira] [Commented] (HIVE-4899) Hive returns non-meaningful error message for ill-formed fs.default.name
[ https://issues.apache.org/jira/browse/HIVE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721260#comment-13721260 ] Ashutosh Chauhan commented on HIVE-4899: +1 Hive returns non-meaningful error message for ill-formed fs.default.name - Key: HIVE-4899 URL: https://issues.apache.org/jira/browse/HIVE-4899 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4899.patch For the query in test case fs_default_name1.q: {code} set fs.default.name='http://www.example.com; show tables; {code} The following error message is returned: {code} FAILED: IllegalArgumentException null {code} The message is not very meaningful, and has null in it. It would be better if we could provide a detailed error message. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4944) Hive Windows Scripts and Compatibility changes
Sushanth Sowmyan created HIVE-4944: -- Summary: Hive Windows Scripts and Compatibility changes Key: HIVE-4944 URL: https://issues.apache.org/jira/browse/HIVE-4944 Project: Hive Issue Type: Bug Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Porting patches that enable hive packaging and running under windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4944) Hive Windows Scripts and Compatibility changes
[ https://issues.apache.org/jira/browse/HIVE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721262#comment-13721262 ] Sushanth Sowmyan commented on HIVE-4944: Attaching 2 patches - one that is an umbrella patch of compatibility changes to enable hive building, testing and packaging under windows, and another that is a patch of scripts for installation, packaging and running. Hive Windows Scripts and Compatibility changes -- Key: HIVE-4944 URL: https://issues.apache.org/jira/browse/HIVE-4944 Project: Hive Issue Type: Bug Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: compat.patch, packaging.patch Porting patches that enable hive packaging and running under windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4944) Hive Windows Scripts and Compatibility changes
[ https://issues.apache.org/jira/browse/HIVE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4944: --- Attachment: packaging.patch compat.patch Hive Windows Scripts and Compatibility changes -- Key: HIVE-4944 URL: https://issues.apache.org/jira/browse/HIVE-4944 Project: Hive Issue Type: Bug Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: compat.patch, packaging.patch Porting patches that enable hive packaging and running under windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4885) Alternative object serialization for execution plan in hive testing
[ https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721269#comment-13721269 ] Edward Capriolo commented on HIVE-4885: --- It turns out xstream is already used in hcatalog somewhere. I did write some code to use it. There is one Java-based unit test that passed, but I did not have time for a performance evaluation or to run the full test suite. I will probably do it over the next two days. Alternative object serialization for execution plan in hive testing Key: HIVE-4885 URL: https://issues.apache.org/jira/browse/HIVE-4885 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4885.patch Currently there are a lot of test cases that involve comparing execution plans, such as those in the TestParse suite. XmlEncoder is used to serialize the plan generated by hive and store it in a file for diff comparison. However, XmlEncoder is tied to the Java compiler, whose implementation may change from version to version. Thus, upgrading the compiler can generate a lot of spurious test failures. 
The following is an example of a diff generated when running hive with JDK7: {code} Begin query: case_sensitivity.q diff -a /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.out /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/parse/case_sensitivity.q.out diff -a -b /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.xml /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/plan/case_sensitivity.q.xml 3c3 object class=org.apache.hadoop.hive.ql.exec.MapRedTask id=MapRedTask0 --- object id=MapRedTask0 class=org.apache.hadoop.hive.ql.exec.MapRedTask 12c12 object class=java.util.ArrayList id=ArrayList0 --- object id=ArrayList0 class=java.util.ArrayList 14c14 object class=org.apache.hadoop.hive.ql.exec.MoveTask id=MoveTask0 --- object id=MoveTask0 class=org.apache.hadoop.hive.ql.exec.MoveTask 18c18 object class=org.apache.hadoop.hive.ql.exec.MoveTask id=MoveTask1 --- object id=MoveTask1 class=org.apache.hadoop.hive.ql.exec.MoveTask 22c22 object class=org.apache.hadoop.hive.ql.exec.StatsTask id=StatsTask0 --- object id=StatsTask0 class=org.apache.hadoop.hive.ql.exec.StatsTask 60c60 object class=org.apache.hadoop.hive.ql.exec.MapRedTask id=MapRedTask1 --- object id=MapRedTask1 class=org.apache.hadoop.hive.ql.exec.MapRedTask {code} As can be seen, the only difference is the order of the attributes in the serialized XML doc, yet it causes 50+ test failures in Hive. We need better plan comparison, or better object serialization, to improve the situation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
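A diff that differs only in attribute order can be detected mechanically before it is allowed to fail a test. The sketch below is a minimal illustration in plain Java; the `attrs` helper and its flattened `key=value` tag format match the diff output quoted above, and are assumptions for illustration, not Hive's actual test harness:

```java
import java.util.Map;
import java.util.TreeMap;

public class AttrOrderDiff {
    // Parse a flattened tag like "object class=Foo id=Bar" into a sorted
    // attribute map, so two serializations of the same object that differ
    // only in attribute order compare equal.
    static Map<String, String> attrs(String tag) {
        Map<String, String> m = new TreeMap<>();
        String[] parts = tag.trim().split("\\s+");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the element name
            String[] kv = parts[i].split("=", 2);
            m.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return m;
    }

    public static void main(String[] args) {
        String a = "object class=org.apache.hadoop.hive.ql.exec.MapRedTask id=MapRedTask0";
        String b = "object id=MapRedTask0 class=org.apache.hadoop.hive.ql.exec.MapRedTask";
        System.out.println(attrs(a).equals(attrs(b))); // prints "true"
    }
}
```

An order-insensitive comparison like this would mask the JDK7 attribute reordering, though switching the serializer (e.g. to xstream, as discussed above) avoids the problem at the source.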
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Attachment: HIVE-4823.1-vectorization.patch implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Affects Version/s: vectorization-branch implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Fix Version/s: vectorization-branch Status: Patch Available (was: In Progress) implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Fix Version/s: (was: vectorization-branch) implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4551) HCatLoader smallint/tinyint promotions to Int have issues with ORC integration
[ https://issues.apache.org/jira/browse/HIVE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721332#comment-13721332 ] Sushanth Sowmyan commented on HIVE-4551: Seems to succeed for me: -- checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 421 files BUILD SUCCESSFUL Total time: 1 minute 33 seconds -- Could you post up what error checkstyle brings up on your end? HCatLoader smallint/tinyint promotions to Int have issues with ORC integration -- Key: HIVE-4551 URL: https://issues.apache.org/jira/browse/HIVE-4551 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: 4551.patch This was initially reported from an e2e test run, with the following E2E test: {code} { 'name' = 'Hadoop_ORC_Write', 'tests' = [ { 'num' = 1 ,'hcat_prep'=q\ drop table if exists hadoop_orc; create table hadoop_orc ( t tinyint, si smallint, i int, b bigint, f float, d double, s string) stored as orc;\ ,'hadoop' = q\ jar :FUNCPATH:/testudf.jar org.apache.hcatalog.utils.WriteText -libjars :HCAT_JAR: :THRIFTSERVER: all100k hadoop_orc\, ,'result_table' = 'hadoop_orc' ,'sql' = q\select * from all100k;\ ,'floatpostprocess' = 1 ,'delimiter' = ' ' }, ], }, {code} This fails with the following error: {code} 2013-04-26 00:26:07,437 WARN org.apache.hadoop.mapred.Child: Error running child org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:53) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at 
org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1195) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.ByteWritable cannot be cast to org.apache.hadoop.io.IntWritable at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.getPrimitiveJavaObject(WritableIntObjectInspector.java:45) at org.apache.hcatalog.data.HCatRecordSerDe.serializePrimitiveField(HCatRecordSerDe.java:290) at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:192) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203) at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:63) ... 12 more 2013-04-26 00:26:07,440 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
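The ClassCastException above arises because the reader hands back the physical tinyint value while the promoted schema expects an int. A minimal sketch of the widening step that is needed, using plain Java boxed types as stand-ins for the ByteWritable/IntWritable classes in the stack trace (the `asInt` helper is hypothetical, not HCatalog's actual API):

```java
// Hedged sketch: when the table schema says int but ORC hands back a
// tinyint/smallint value, widen it before treating it as an int instead of
// blindly casting. Plain boxed types stand in for the Writable classes.
public class PromoteSketch {
    static int asInt(Object v) {
        if (v instanceof Byte)    return ((Byte) v).intValue();   // tinyint
        if (v instanceof Short)   return ((Short) v).intValue();  // smallint
        if (v instanceof Integer) return (Integer) v;
        throw new ClassCastException(v.getClass() + " cannot be promoted to int");
    }

    public static void main(String[] args) {
        System.out.println(asInt((byte) 7)); // prints "7" instead of throwing
    }
}
```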
[jira] [Commented] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721337#comment-13721337 ] Eric Hanson commented on HIVE-4823: --- Depends on concat patch (HIVE-4512). implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
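For context on what a vectorized TRIM does end-to-end: rather than calling String.trim() row by row through an object interface, the vectorized expression adjusts per-row start/length offsets over a whole column in one tight loop. A simplified sketch, with plain arrays standing in for Hive's BytesColumnVector fields (this is an illustration, not the attached patch):

```java
// Hedged sketch of vectorized TRIM: bytes[i] holds row i's data, and
// start[i]/length[i] delimit the string inside it. Trimming just moves the
// offsets; no per-row object allocation or copying is needed.
public class VectorTrimSketch {
    static void trim(byte[][] bytes, int[] start, int[] length, int n) {
        for (int i = 0; i < n; i++) {
            while (length[i] > 0 && bytes[i][start[i]] == ' ') { start[i]++; length[i]--; }
            while (length[i] > 0 && bytes[i][start[i] + length[i] - 1] == ' ') { length[i]--; }
        }
    }

    public static void main(String[] args) {
        byte[][] b = { "  hi  ".getBytes() };
        int[] s = { 0 }, l = { 6 };
        trim(b, s, l, 1);
        System.out.println(new String(b[0], s[0], l[0])); // prints "hi"
    }
}
```

Tying such an expression into VectorizationContext, as the issue describes, is what lets a SQL query invoking TRIM()/LTRIM()/RTRIM() actually run down the vectorized path.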
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721342#comment-13721342 ] Hive QA commented on HIVE-4929: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594427/HIVE-4929.patch {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/198/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/198/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
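A sketch of the literal-typing rule the report describes: the constant keeps its most restrictive type (or the type its suffix mandates) and should not be rewritten to double merely because it is compared against a column. The logic below is a simplified stand-in for illustration, not Hive's actual type-checking code:

```java
// Hedged sketch: choose a numeric literal's type from its own form.
// Suffix letters and the "most restrictive type" default are assumptions
// modeled on the issue description, not the real implementation.
public class LiteralTypeSketch {
    static String literalType(String tok) {
        if (tok.endsWith("L") || tok.endsWith("l")) return "bigint";   // suffix wins
        if (tok.endsWith("S") || tok.endsWith("s")) return "smallint";
        if (tok.endsWith("Y") || tok.endsWith("y")) return "tinyint";
        if (tok.contains(".") || tok.contains("E") || tok.contains("e")) return "double";
        return "int"; // most restrictive default for an un-suffixed integer
    }

    public static void main(String[] args) {
        System.out.println(literalType("100"));  // prints "int", not "double"
        System.out.println(literalType("100L")); // prints "bigint"
    }
}
```

The bug is that a later step overrode this choice with double for every comparison, because the HIVE-3059 hack made every column type look like string.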
[jira] [Created] (HIVE-4945) Make RLIKE/REGEXP run end-to-end by updating VectorizationContext
Eric Hanson created HIVE-4945: - Summary: Make RLIKE/REGEXP run end-to-end by updating VectorizationContext Key: HIVE-4945 URL: https://issues.apache.org/jira/browse/HIVE-4945 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4551) HCatLoader smallint/tinyint promotions to Int have issues with ORC integration
[ https://issues.apache.org/jira/browse/HIVE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4551: - Status: Patch Available (was: Open) HCatLoader smallint/tinyint promotions to Int have issues with ORC integration -- Key: HIVE-4551 URL: https://issues.apache.org/jira/browse/HIVE-4551 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: 4551.patch This was initially reported from an e2e test run, with the following E2E test: {code} { 'name' = 'Hadoop_ORC_Write', 'tests' = [ { 'num' = 1 ,'hcat_prep'=q\ drop table if exists hadoop_orc; create table hadoop_orc ( t tinyint, si smallint, i int, b bigint, f float, d double, s string) stored as orc;\ ,'hadoop' = q\ jar :FUNCPATH:/testudf.jar org.apache.hcatalog.utils.WriteText -libjars :HCAT_JAR: :THRIFTSERVER: all100k hadoop_orc\, ,'result_table' = 'hadoop_orc' ,'sql' = q\select * from all100k;\ ,'floatpostprocess' = 1 ,'delimiter' = ' ' }, ], }, {code} This fails with the following error: {code} 2013-04-26 00:26:07,437 WARN org.apache.hadoop.mapred.Child: Error running child org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:53) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1195) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.ByteWritable cannot be cast to org.apache.hadoop.io.IntWritable at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.getPrimitiveJavaObject(WritableIntObjectInspector.java:45) at org.apache.hcatalog.data.HCatRecordSerDe.serializePrimitiveField(HCatRecordSerDe.java:290) at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:192) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203) at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:63) ... 12 more 2013-04-26 00:26:07,440 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
My mistake on saying hcat was a fork of the metastore. I had a brain fart for a moment. One way we could do this is to create a folder called downstream. In our release step we can execute the downstream builds and then copy the files we need back. So nothing downstream will be on the classpath of the main project. This could help us break up ql as well. Things like exotic file formats, and things that are pluggable like zk locking, can go here. That might be overkill. For now we can focus on building downstream, and hive thrift 1 might be the first thing to try moving downstream.
On Friday, July 26, 2013, Thejas Nair the...@hortonworks.com wrote: +1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to the rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as a set of libraries that let pig and java MR use the hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in the same project.
On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also I believe hcatalog web can fall into the same designation. Question: hcatalog was initially a big hive-metastore fork. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in.
On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well.
On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing on a dual-core 2 GB RAM laptop for years now. 
With the addition of hcatalog, hive-thrift2, and some other growth, trying to develop hive in Eclipse on this machine crawls, especially if 'build automatically' is turned on. As we look to add more things this is only going to get worse. I am also noticing issues like this: https://issues.apache.org/jira/browse/HIVE-4849 What I think we should do is strip down/out optional parts of hive.
1) Hive HBase This should really be its own project; to do this right we really have to have multiple branches, since hbase is not backwards compatible.
2) Hive Web Interface Not really a big project, but not really critical; it can just as easily be built separately.
3) hive thrift 1 We have hive thrift 2 now; it is time for the sun to set on hive thrift 1.
4) odbc Not entirely convinced about this one, but it is really not critical to running hive.
What I think we should do is create sub-projects for the above things or simply move them into directories that do not build with hive. Ideally they would use maven to pull dependencies. What does everyone think?
I agree that projects like the HBase handler and probably others as well should somehow be downstream projects which simply depend on the hive jars. I see a couple of alternatives for this:
* Take the module in question to the Apache Incubator
* Move the module in question to the Apache Extras
* Break up the projects within our own source tree
I'd prefer the third option at this point. Brock
[jira] [Commented] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721351#comment-13721351 ] Hive QA commented on HIVE-4928: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594443/HIVE-4928.1.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/200/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/200/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-200/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1507508. At revision 1507508. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. Date literals do not work properly in partition spec clause --- Key: HIVE-4928 URL: https://issues.apache.org/jira/browse/HIVE-4928 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4928.1.patch.txt The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4586) [HCatalog] WebHCat should return 404 error for undefined resource
[ https://issues.apache.org/jira/browse/HIVE-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4586: - Status: Patch Available (was: Open) Turns out the checkstyle problem is separate. Returning this to patch available. [HCatalog] WebHCat should return 404 error for undefined resource - Key: HIVE-4586 URL: https://issues.apache.org/jira/browse/HIVE-4586 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4586-1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4928: - Description: The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. NO PRECOMMIT TESTS was: The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. Date literals do not work properly in partition spec clause --- Key: HIVE-4928 URL: https://issues.apache.org/jira/browse/HIVE-4928 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4928.1.patch.txt The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. 
This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
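The fix the description points at can be sketched as: when the partition value node is a DATE literal, descend to its child token and strip the quotes, instead of using the token name itself. The Node class and method names below are simplified stand-ins for Hive's ASTNode API, not the attached patch:

```java
// Hedged sketch of extracting a partition value from a parsed partition spec.
// For (TOK_DATELITERAL '1999-12-31') the value is the child's unquoted text;
// for plain string/numeric literals the node text is already the value.
import java.util.Collections;
import java.util.List;

public class PartValSketch {
    static final String TOK_DATELITERAL = "TOK_DATELITERAL";

    static class Node {
        final String text;
        final List<Node> children;
        Node(String text, List<Node> children) { this.text = text; this.children = children; }
    }

    static String unquote(String s) {
        if (s.length() >= 2 && (s.startsWith("'") || s.startsWith("\""))) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    static String partitionValue(Node n) {
        if (TOK_DATELITERAL.equals(n.text)) {
            return unquote(n.children.get(0).text); // descend past the token name
        }
        return n.text;
    }

    public static void main(String[] args) {
        Node lit = new Node(TOK_DATELITERAL,
                List.of(new Node("'1999-12-31'", Collections.emptyList())));
        System.out.println(partitionValue(lit)); // prints "1999-12-31"
    }
}
```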