[jira] [Updated] (HIVE-12021) HivePreFilteringRule may introduce wrong common operands

2015-10-09 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12021:
---
Attachment: HIVE-12021.branch-1.patch
HIVE-12021.branch-1.2.patch

> HivePreFilteringRule may introduce wrong common operands
> 
>
> Key: HIVE-12021
> URL: https://issues.apache.org/jira/browse/HIVE-12021
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-12021.01.patch, HIVE-12021.02.patch, 
> HIVE-12021.02.patch, HIVE-12021.branch-1.2.patch, HIVE-12021.branch-1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12060) LLAP: create separate variable for llap tests

2015-10-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950063#comment-14950063
 ] 

Lefty Leverenz commented on HIVE-12060:
---

There's a typo in the description of 
*hive.tez.input.generate.consistent.splits*:  "Whether to generate consisten 
split" -- need "t" for consistent.

Several of the new hive.llap.* configs don't have descriptions.  Are they for 
internal use only?

Please add newlines (\n) in the description of 
*hive.llap.queue.metrics.percentiles.intervals* and keep the indentation 
identical for all three lines of the description.  (And to pick a nit, a few 
config description indentations are off by one character, including that one.)

> LLAP: create separate variable for llap tests
> -
>
> Key: HIVE-12060
> URL: https://issues.apache.org/jira/browse/HIVE-12060
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12060.23.patch
>
>
> No real reason to just reuse tez one





[jira] [Commented] (HIVE-11149) Fix issue with sometimes HashMap in PerfLogger.java hangs

2015-10-09 Thread NING DING (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950121#comment-14950121
 ] 

NING DING commented on HIVE-11149:
--

Could anyone apply this patch to branch-1.2? Thanks.

> Fix issue with sometimes HashMap in PerfLogger.java hangs 
> --
>
> Key: HIVE-11149
> URL: https://issues.apache.org/jira/browse/HIVE-11149
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Fix For: 2.0.0
>
> Attachments: HIVE-11149.01.patch, HIVE-11149.02.patch, 
> HIVE-11149.03.patch, HIVE-11149.04.patch
>
>
> In a multi-threaded environment, the HashMap in PerfLogger.java sometimes
> causes many Java processes to hang, consuming large amounts of CPU and
> memory unnecessarily.
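
As a minimal sketch of one common remedy for this class of problem (not necessarily what the attached patches do), state shared across threads can be kept in a java.util.concurrent.ConcurrentHashMap instead of an unsynchronized HashMap. The class TimingLog and its methods are hypothetical stand-ins, not PerfLogger's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a shared timing log; this is not Hive's PerfLogger.
final class TimingLog {
    // ConcurrentHashMap tolerates concurrent writers; a plain HashMap does not.
    private final Map<String, Long> startTimes = new ConcurrentHashMap<>();

    void begin(String method, long nowMillis) {
        startTimes.put(method, nowMillis);
    }

    // Returns the elapsed time, or -1 if begin() was never called for this method.
    long end(String method, long nowMillis) {
        Long start = startTimes.remove(method);
        return start == null ? -1L : nowMillis - start;
    }

    int pending() {
        return startTimes.size();
    }

    // Writes distinct keys from several threads at once and returns the pending count.
    static int runConcurrently(int threads, int keysPerThread) {
        TimingLog log = new TimingLog();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int k = 0; k < keysPerThread; k++) {
                    log.begin("op-" + id + "-" + k, 0L);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return log.pending();
    }
}
```

With distinct keys per thread, every insertion should survive, so the pending count equals threads times keysPerThread.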





[jira] [Commented] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950147#comment-14950147
 ] 

Hive QA commented on HIVE-11609:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765499/HIVE-11609.3.patch.txt

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9640 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_decimal_round.q-cbo_windowing.q-tez_schema_evolution.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_custom_key3
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5581/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5581/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5581/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765499 - PreCommit-HIVE-TRUNK-Build

> Capability to add a filter to hbase scan via composite key doesn't work
> ---
>
> Key: HIVE-11609
> URL: https://issues.apache.org/jira/browse/HIVE-11609
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Attachments: HIVE-11609.1.patch.txt, HIVE-11609.2.patch.txt, 
> HIVE-11609.3.patch.txt
>
>
> It seems that the capability to add a filter to an HBase scan, which was
> added as part of HIVE-6411, doesn't work. This is primarily because in
> HiveHBaseInputFormat the filter is added in getSplits instead of
> getRecordReader. This works fine for start and stop keys but not for a
> filter, because a filter is respected only when an actual scan is performed.
> This is also related to the initial refactoring that was done as part of
> HIVE-3420.





[jira] [Commented] (HIVE-12064) prevent transactional=false

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950150#comment-14950150
 ] 

Hive QA commented on HIVE-12064:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765504/HIVE-12064.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5582/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5582/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5582/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5582/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   fb24eb3..91fe1e1  branch-1   -> origin/branch-1
   d60f33c..cc3b2b0  branch-1.2 -> origin/branch-1.2
   aded0d3..be05e32  master -> origin/master
+ git reset --hard HEAD
HEAD is now at aded0d3 HIVE-11149 : Fix issue with sometimes HashMap in 
PerfLogger.java hangs (WangMeng, reviewed by Xuefu Zhang, Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at be05e32 HIVE-12021: HivePreFilteringRule may introduce wrong 
common operands (Jesus Camacho Rodriguez, reviewed by Laljo John Pullokkaran)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765504 - PreCommit-HIVE-TRUNK-Build

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12064.patch
>
>
> Currently, the table property transactional=true must be set to make a table
> behave in an ACID-compliant way.
> This is misleading: it suggests that changing it to transactional=false makes
> the table non-ACID, but the on-disk layout of an ACID table differs from that
> of a plain table, so changing this property may cause wrong data to be
> returned.
> We should prevent setting transactional=false.
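
The check being asked for could look roughly like the sketch below. It is only an illustration under stated assumptions: the class TransactionalPropertyCheck and its method are hypothetical, and the sketch also rejects dropping the property from an ACID table, which the issue text does not explicitly ask for.

```java
import java.util.Map;

// Hypothetical validator; the class and method names are illustrative only.
final class TransactionalPropertyCheck {
    static final String KEY = "transactional";

    // Returns true if changing table properties from oldProps to newProps is allowed.
    // Rejects any change that turns an ACID table back into a non-ACID one
    // (assumption: removing the property counts as turning ACID off).
    static boolean isAllowed(Map<String, String> oldProps, Map<String, String> newProps) {
        boolean wasAcid = "true".equalsIgnoreCase(oldProps.get(KEY));
        boolean staysAcid = "true".equalsIgnoreCase(newProps.get(KEY));
        return !wasAcid || staysAcid;
    }
}
```

A caller would run this check before applying an ALTER TABLE that touches table properties and fail the statement when it returns false.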





[jira] [Updated] (HIVE-12075) add analyze command to explicitly cache file metadata in HBase metastore

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12075:

Attachment: HIVE-12075.patch
HIVE-12075.nogen.patch

WIP patch for backup with all the plumbing. The code is there, but probably 
doesn't work, and the API doesn't actually do anything: work is put in the 
queue and never done.


> add analyze command to explicitly cache file metadata in HBase metastore
> ---
>
> Key: HIVE-12075
> URL: https://issues.apache.org/jira/browse/HIVE-12075
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12075.nogen.patch, HIVE-12075.patch
>
>
> ANALYZE TABLE (spec as usual) CACHE METADATA





[jira] [Updated] (HIVE-12075) add analyze command to explicitly cache file metadata in HBase metastore

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12075:

Description: 
ANALYZE TABLE (spec as usual) CACHE METADATA

NO PRECOMMIT TESTS

  was:ANALYZE TABLE (spec as usual) CACHE METADATA


> add analyze command to explicitly cache file metadata in HBase metastore
> ---
>
> Key: HIVE-12075
> URL: https://issues.apache.org/jira/browse/HIVE-12075
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12075.nogen.patch, HIVE-12075.patch
>
>
> ANALYZE TABLE (spec as usual) CACHE METADATA
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-12083:

Attachment: HIVE-12083.patch

Patch attached, with tests.

[~sershe]/[~thejas], could you please review?

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>    public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>        List<String> partNames, List<String> colNames, boolean useDensityFunctionForNDVEstimation)
>        throws MetaException {
> +    if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); // Nothing to aggregate.
>      long partsFound = partsFoundForPartitions(dbName, tableName, partNames, colNames);
>      List<ColumnStatisticsObj> colStatsList;
>      // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list<ColumnStatisticsObj> colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is unset! Struct:AggrStats(colStats:null, partsFound:0)
>   at org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur, since HIVE-10965 also includes a client-side
> guard that skips the metastore call entirely when colNames.isEmpty().
> However, there is no such guard for an empty partNames, and the error would
> still occur on the metastore side if the thrift call were made directly, as
> would happen with a client from an older version that predates this patch.
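
The failure mode can be reproduced in miniature. The sketch below uses AggrStatsSketch, a simplified stand-in written for this illustration (it is not the thrift-generated AggrStats, and the element type is reduced to String). It shows why a no-argument construction fails a required-field check, and one plausible shape of a short-circuit result that passes it; whether HIVE-12083.patch takes this approach is not stated in this message.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the thrift-generated AggrStats; not the real class.
final class AggrStatsSketch {
    private List<String> colStats;   // required field 1 (element type simplified)
    private long partsFound;         // required field 2
    private boolean partsFoundSet;   // tracks whether field 2 was ever assigned

    // Mirrors "new AggrStats()": no required field is set.
    AggrStatsSketch() {}

    AggrStatsSketch(List<String> colStats, long partsFound) {
        this.colStats = colStats;
        this.partsFound = partsFound;
        this.partsFoundSet = true;
    }

    // Mirrors the idea behind thrift's validate(): every required field must be set.
    boolean isValid() {
        return colStats != null && partsFoundSet;
    }

    // Short-circuit result that still satisfies the required-field check.
    static AggrStatsSketch empty() {
        return new AggrStatsSketch(new ArrayList<>(), 0L);
    }
}
```

Returning an explicitly empty list with a zero count keeps the "nothing to aggregate" meaning while remaining serializable.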





[jira] [Updated] (HIVE-11514) Vectorized version of auto_sortmerge_join_1.q fails during execution with NPE

2015-10-09 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11514:

Attachment: HIVE-11514.02.patch

Enhanced comment and rebased.

> Vectorized version of auto_sortmerge_join_1.q fails during execution with NPE
> -
>
> Key: HIVE-11514
> URL: https://issues.apache.org/jira/browse/HIVE-11514
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11514.01.patch, HIVE-11514.02.patch, 
> auto_sortmerge_join_1.q
>
>
> Query from auto_sortmerge_join_1.q:
> {code}
> select count(*) FROM bucket_big a JOIN bucket_small b ON a.key = b.key
> {code}
> generates stack trace:
> {code}
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.initializeOp(VectorMapJoinOperator.java:177)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:131)
> {code}





[jira] [Commented] (HIVE-12025) refactor bucketId generating code

2015-10-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951424#comment-14951424
 ] 

Eugene Koifman commented on HIVE-12025:
---

The 2 unit test failures are not related.



> refactor bucketId generating code
> -
>
> Key: HIVE-12025
> URL: https://issues.apache.org/jira/browse/HIVE-12025
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.0.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12025.2.patch, HIVE-12025.3.patch, HIVE-12025.patch
>
>
> HIVE-11983 adds ObjectInspectorUtils.getBucketHashCode() and 
> getBucketNumber().
> There are at least the following places in Hive that perform this computation:
> # ReduceSinkOperator.computeBucketNumber
> # ReduceSinkOperator.computeHashCode
> # BucketIdResolverImpl - only in 2.0.0 ASF line
> # FileSinkOperator.findWriterOffset
> # GenericUDFHash
> We should refactor them so that they all call the methods in
> ObjectInspectorUtils.
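
The goal of the refactoring is a single shared computation that every call site delegates to. A minimal sketch of such a helper follows; BucketMath is a hypothetical name, and the exact signatures of the ObjectInspectorUtils methods are not shown in this thread, so the formula here is an assumption based on the usual "clear the sign bit, then take the modulus" approach.

```java
// Hypothetical shared helper; the real methods live in ObjectInspectorUtils
// and their exact signatures are not shown in this thread.
final class BucketMath {
    // Maps a hash code to a bucket in [0, numBuckets), clearing the sign bit first
    // so that negative hash codes do not yield negative bucket numbers.
    static int bucketNumber(int hashCode, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive: " + numBuckets);
        }
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }
}
```

If every operator listed above called one method like this, a change to the bucketing rule would only need to be made in one place.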





[jira] [Assigned] (HIVE-11931) Join sql cannot get result

2015-10-09 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang reassigned HIVE-11931:
-

Assignee: Xiaomeng Huang

> Join sql cannot get result
> --
>
> Key: HIVE-11931
> URL: https://issues.apache.org/jira/browse/HIVE-11931
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Query Planning, Query Processor, SQL
>Affects Versions: 1.1.1, 1.2.1
>Reporter: NING DING
>Assignee: Xiaomeng Huang
> Attachments: 00_0
>
>
> I found a join issue in hive-1.2.1 and hive-1.1.1.
> The CREATE TABLE statement is below.
> {code}
> CREATE TABLE IF NOT EXISTS join_case(
> orderid  bigint,
> tradeitemid bigint,
> id bigint
> ) ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ',' 
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
> {code}
> Put the attached sample data file 00_0 in the /tmp/join_case folder, then
> load the data.
> {code}
> LOAD DATA LOCAL INPATH '/tmp/join_case/00_0' OVERWRITE INTO TABLE 
> join_case;
> {code}
> Running the following SQL returns no results.
> {code}
> select a.id from 
> (
> select orderid as orderid, max(id) as id from join_case group by orderid
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10;
> {code}
> This issue also occurs in hive-1.1.0-cdh5.4.5.
> In Apache hive-1.0.1, however, the above SQL returns 10 rows.
> After swapping the order of "orderid as orderid" and "max(id) as id",
> the following SQL returns results in hive-1.2.1 and hive-1.1.1.
> {code}
> select a.id from 
> (
> select max(id) as id, orderid as orderid from join_case group by orderid
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10;
> {code}
> The following SQL also returns results in hive-1.2.1 and hive-1.1.1.
> {code}
> select a.id from 
> (
> select orderid as orderid, id as id from join_case group by orderid, id
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10; 
> {code}
> Can anyone take a look at this issue?
> Thanks.





[jira] [Resolved] (HIVE-11931) Join sql cannot get result

2015-10-09 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang resolved HIVE-11931.
---
Resolution: Duplicate

Marking this as a duplicate.

> Join sql cannot get result
> --
>
> Key: HIVE-11931
> URL: https://issues.apache.org/jira/browse/HIVE-11931
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Query Planning, Query Processor, SQL
>Affects Versions: 1.1.1, 1.2.1
>Reporter: NING DING
>Assignee: Xiaomeng Huang
> Attachments: 00_0
>
>





[jira] [Commented] (HIVE-11802) Float-point numbers are displayed with different precision in Beeline/JDBC

2015-10-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951535#comment-14951535
 ] 

Carl Steinbach commented on HIVE-11802:
---

I think this is a case where correctness is more important than performance. 
Let's aim first for the former, and once it's achieved we can worry about the 
latter.

> Float-point numbers are displayed with different precision in Beeline/JDBC
> --
>
> Key: HIVE-11802
> URL: https://issues.apache.org/jira/browse/HIVE-11802
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Fix For: 2.0.0
>
> Attachments: HIVE-11802.3.patch
>
>
> After inserting floating-point numbers into a table, the values displayed in
> Beeline or via JDBC have different precision.
> How to reproduce:
> {noformat}
> 0: jdbc:hive2://localhost:1> create table decimals (f float, af 
> array<float>, d double, ad array<double>) stored as parquet;
> No rows affected (0.294 seconds)
> 0: jdbc:hive2://localhost:1> insert into table decimals select 1.10058, 
> array(cast(1.10058 as float)), 2.0133, array(2.0133) from dummy limit 1;
> ...
> No rows affected (20.089 seconds)
> 0: jdbc:hive2://localhost:1> select f, af, af[0], d, ad[0] from decimals;
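> +---------------------+------------+---------------------+---------+---------+--+
> |          f          |     af     |         _c2         |    d    |   _c4   |
> +---------------------+------------+---------------------+---------+---------+--+
> | 1.1005799770355225  | [1.10058]  | 1.1005799770355225  | 2.0133  | 2.0133  |
> +---------------------+------------+---------------------+---------+---------+--+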
> {noformat}
> When displaying arrays, the values are displayed correctly, but if I print a
> specific element, it is displayed with more decimal places.
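
One plausible explanation, assumed here rather than confirmed by this thread, is that the scalar float is widened to a double before it is formatted, while the array element is formatted as a float. The sketch below uses only the JDK (no Hive or JDBC classes) and a hypothetical class FloatDisplay to show the difference between the two formatting paths.

```java
// Illustrative only; does not use any Hive or JDBC classes.
final class FloatDisplay {
    // Formats the float directly: the shortest decimal that identifies this float.
    static String asFloat(float f) {
        return Float.toString(f);
    }

    // Widens to double first, exposing digits that the float could not
    // represent exactly.
    static String asWidenedDouble(float f) {
        return Double.toString((double) f);
    }
}
```

For the value 1.10058 from the reproduction steps, the first path prints the short form and the second prints the long form seen in the f and _c2 columns.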





[jira] [Commented] (HIVE-12082) Null comparison for greatest and least operator

2015-10-09 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951403#comment-14951403
 ] 

Szehon Ho commented on HIVE-12082:
--

Yeah, that's what I found too.

It looks like Hive already behaves correctly for comparison operators (like 
'>'); it is only broken for 'greatest' and 'least'.

{noformat}
hive> SELECT 10Y > null
> FROM test
> ;
OK
NULL

hive> select greatest(null, 1) from test;
OK
1
{noformat}

I'll try to refactor the greatest/least operator class to use the same logic 
as those; the only difference is that it's a multi-way comparison.
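
The intended semantics can be sketched independently of Hive's UDF classes. NullAwareGreatest below is a hypothetical illustration of a multi-way comparison in which any null argument makes the whole result null, matching the MySQL behavior quoted in the issue description; it is not the planned patch.

```java
// Illustrative sketch of MySQL-style semantics; not Hive's GenericUDF code.
final class NullAwareGreatest {
    // Returns null if any argument is null, otherwise the largest argument.
    // An empty argument list also yields null.
    @SafeVarargs
    static <T extends Comparable<T>> T greatest(T... values) {
        T best = null;
        for (T v : values) {
            if (v == null) {
                return null; // a single null decides the whole result
            }
            if (best == null || v.compareTo(best) > 0) {
                best = v;
            }
        }
        return best;
    }
}
```

A matching least() would differ only in the direction of the compareTo test.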

> Null comparison for greatest and least operator
> ---
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> In MySQL comparisons, if any of the entries is null, then the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
>  and 
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following mysql query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> |              NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> |               NULL |
> +--------------------+
> 1 row in set (0.00 sec)
> {noformat}
> This is in contrast to Hive, where null does not take precedence over
> non-null values in greatest and least.





[jira] [Commented] (HIVE-12025) refactor bucketId generating code

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951420#comment-14951420
 ] 

Hive QA commented on HIVE-12025:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765658/HIVE-12025.3.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9645 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-update_orig_table.q-vectorization_13.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5588/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5588/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5588/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765658 - PreCommit-HIVE-TRUNK-Build

> refactor bucketId generating code
> -
>
> Key: HIVE-12025
> URL: https://issues.apache.org/jira/browse/HIVE-12025
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.0.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12025.2.patch, HIVE-12025.3.patch, HIVE-12025.patch
>
>





[jira] [Updated] (HIVE-12074) Conditionally turn off hybrid grace hash join based on est. data size, etc

2015-10-09 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12074:
-
Attachment: HIVE-12074.1.patch

> Conditionally turn off hybrid grace hash join based on est. data size, etc
> --
>
> Key: HIVE-12074
> URL: https://issues.apache.org/jira/browse/HIVE-12074
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12074.1.patch
>
>
> Currently, as long as the below flag is set to true, we always do grace hash 
> join for map join. This may not be necessary, esp. for cases where the data 
> size is quite small, and number of distinct values is also small.
> hive.mapjoin.hybridgrace.hashtable





[jira] [Updated] (HIVE-12084) Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java heap space

2015-10-09 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12084:
-
Attachment: HIVE-12084.1.patch

> Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java 
> heap space
> --
>
> Key: HIVE-12084
> URL: https://issues.apache.org/jira/browse/HIVE-12084
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12084.1.patch
>
>
> STEPS TO REPRODUCE:
> {code}
> CREATE TABLE `sample_07` ( `code` string , `description` string , `total_emp` 
> int , `salary` int ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS 
> TextFile;
> load data local inpath 'sample_07.csv'  into table sample_07;
> set hive.limit.pushdown.memory.usage=0.;
> select * from sample_07 order by salary LIMIT 9;
> {code}
> This will result in 
> {code}
> Caused by: java.lang.OutOfMemoryError: Java heap space
>   at org.apache.hadoop.hive.ql.exec.TopNHash.initialize(TopNHash.java:113)
>   at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:234)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:68)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
>   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
> {code}
> The basic issue lies with the top-N optimization, which needs an upper 
> bound. Ideally we would detect that the bytes to be allocated exceed the 
> budget set by "limit.pushdown.memory.usage" without trying to allocate them.
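
A minimal sketch of such a pre-allocation check follows. It is not the code from the attached patch: the class, the method, and the per-entry byte estimate are assumptions made for illustration, and memUsage stands for the fraction configured in hive.limit.pushdown.memory.usage.

```java
// Hypothetical sketch, not Hive code: check whether a top-N buffer would fit
// in the configured memory budget before anything is allocated.
final class TopNBudgetCheck {

    /**
     * @param topN           the LIMIT value
     * @param bytesPerEntry  assumed bookkeeping cost per retained row
     * @param memUsage       fraction of the heap the optimization may use
     * @param maxHeapBytes   maximum heap size, e.g. Runtime.getRuntime().maxMemory()
     */
    static boolean fitsBudget(long topN, long bytesPerEntry,
                              double memUsage, long maxHeapBytes) {
        if (topN <= 0 || bytesPerEntry <= 0 || memUsage <= 0.0 || maxHeapBytes <= 0) {
            return false;
        }
        long budget = (long) (memUsage * (double) maxHeapBytes);
        // Divide instead of multiplying so a huge LIMIT cannot overflow a long.
        return topN <= budget / bytesPerEntry;
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println(fitsBudget(10L, 64L, 0.1, heap));
    }
}
```

When the check fails, the operator would presumably fall back to the non-optimized path rather than attempt the allocation.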



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11856) allow split strategies to run on threadpool

2015-10-09 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951439#comment-14951439
 ] 

Vikram Dixit K commented on HIVE-11856:
---

+1 LGTM.

> allow split strategies to run on threadpool
> ---
>
> Key: HIVE-11856
> URL: https://issues.apache.org/jira/browse/HIVE-11856
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11856.01.patch, HIVE-11856.patch
>
>
> If a split strategy makes metastore cache calls, it should probably run on 
> the threadpool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9490) [Parquet] Support Alter Table/Partition Concatenate

2015-10-09 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951442#comment-14951442
 ] 

Ryan Blue commented on HIVE-9490:
-

There's a patch available on PARQUET-382 that implements this for Parquet. Hive 
would just need to take advantage of that.

> [Parquet] Support Alter Table/Partition Concatenate
> ---
>
> Key: HIVE-9490
> URL: https://issues.apache.org/jira/browse/HIVE-9490
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Dong Chen
>Assignee: Dong Chen
> Attachments: HIVE-9490.patch-testcase
>
>
> Parquet should support 
> {{ALTER TABLE table_name \[PARTITION (partition_key = 'partition_value')\] 
> CONCATENATE;}}
> If the table or partition contains many small Parquet files, then the above 
> command will merge them into larger files. The merge should happen at row 
> group level thereby avoiding the overhead of decompressing and decoding the 
> data. 
> Currently it is supported only for RCFile and ORC files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12064) prevent transactional=false

2015-10-09 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12064:
--
Attachment: HIVE-12064.2.patch

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12064.2.patch, HIVE-12064.patch
>
>
> Currently the table property transactional=true must be set to make a table 
> behave in an ACID-compliant way.
> This is misleading: it suggests that changing it to transactional=false 
> makes the table non-ACID, but the on-disk layout of an ACID table differs 
> from that of a plain table, so changing this property may cause wrong data 
> to be returned.
> We should prevent setting transactional=false.
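
As a rough illustration of the proposed restriction (not the code in the attached patches), a guard could compare the table properties before and after an ALTER TABLE. The class and method names below are hypothetical, and both maps are assumed to hold the complete property set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Hive code: reject an ALTER TABLE that would turn
// an ACID table back into a non-ACID one.
final class TransactionalPropertyGuard {
    static final String TRANSACTIONAL = "transactional";

    /** Throws if the table was transactional and would no longer be. */
    static void validateAlter(Map<String, String> oldProps,
                              Map<String, String> newProps) {
        boolean wasAcid = "true".equalsIgnoreCase(oldProps.get(TRANSACTIONAL));
        boolean staysAcid = "true".equalsIgnoreCase(newProps.get(TRANSACTIONAL));
        if (wasAcid && !staysAcid) {
            throw new IllegalArgumentException(
                "transactional=true cannot be unset: the on-disk layout of an "
                + "ACID table differs from that of a plain table");
        }
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put(TRANSACTIONAL, "true");
        Map<String, String> after = new HashMap<>();
        after.put(TRANSACTIONAL, "false");
        try {
            validateAlter(before, after);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```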



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12039) Fix TestSSL#testSSLVersion

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951492#comment-14951492
 ] 

Hive QA commented on HIVE-12039:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765663/HIVE-12039.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9659 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5589/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5589/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5589/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765663 - PreCommit-HIVE-TRUNK-Build

> Fix TestSSL#testSSLVersion 
> ---
>
> Key: HIVE-12039
> URL: https://issues.apache.org/jira/browse/HIVE-12039
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12039.1.patch
>
>
> Looks like it's only run on Linux and failing after HIVE-11720.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12066) Add javadoc for methods added to public APIs

2015-10-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950048#comment-14950048
 ] 

Lefty Leverenz commented on HIVE-12066:
---

Shouldn't the patch be attached to this issue?  (Commit 
cf76e6b5d89dfa2d44b96c852a10ccd0252a4fe8.)

> Add javadoc for methods added to public APIs
> 
>
> Key: HIVE-12066
> URL: https://issues.apache.org/jira/browse/HIVE-12066
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>
> Looking through the changes for ORC, there are methods being added without 
> documentation:
> {code}
> --- ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java
> @@ -360,8 +353,18 @@ RecordReader rows(long offset, long length,
>MetadataReader metadata() throws IOException;
> +  List getVersionList();
> +
> +  int getMetadataSize();
> +
> +  List getOrcProtoStripeStatistics();
> +
> +  List getStripeStatistics();
> +
> +  List getOrcProtoFileStatistics();
> +
> +  DataReader createDefaultDataReader(boolean useZeroCopy);
> +
> {code}
> You really need to look through all of the interfaces and fix them before 
> merging into master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-10-09 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11981:

Attachment: HIVE-11981.01.patch

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> --
>
> Key: HIVE-11981
> URL: https://issues.apache.org/jira/browse/HIVE-11981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11981.01.patch, ORC Schema Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other kinds of 
> schema evolution were not pursued due to lack of importance and lack of 
> time.  Also, it appears that much more sophisticated metadata would be 
> needed to support them.
> The biggest issues for users have been adding new columns to ACID tables 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-10-09 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11981:

Attachment: (was: HIVE-11981.01.patch)

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> --
>
> Key: HIVE-11981
> URL: https://issues.apache.org/jira/browse/HIVE-11981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11981.01.patch, ORC Schema Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other kinds of 
> schema evolution were not pursued due to lack of importance and lack of 
> time.  Also, it appears that much more sophisticated metadata would be 
> needed to support them.
> The biggest issues for users have been adding new columns to ACID tables 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11894) CBO: Calcite Operator To Hive Operator (Calcite Return Path): correct table column name in CTAS queries

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949984#comment-14949984
 ] 

Hive QA commented on HIVE-11894:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765488/HIVE-11894.05.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9657 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_outer_join_ppr
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5579/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5579/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5579/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765488 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): correct table 
> column name in CTAS queries
> ---
>
> Key: HIVE-11894
> URL: https://issues.apache.org/jira/browse/HIVE-11894
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11894.01.patch, HIVE-11894.02.patch, 
> HIVE-11894.03.patch, HIVE-11894.04.patch, HIVE-11894.05.patch
>
>
> To repro, run lineage2.q with return path turned on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11931) Join sql cannot get result

2015-10-09 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950052#comment-14950052
 ] 

Xiaomeng Huang commented on HIVE-11931:
---

Hi [~iceberg565]
I think this bug has been fixed in https://issues.apache.org/jira/browse/HIVE-10996

> Join sql cannot get result
> --
>
> Key: HIVE-11931
> URL: https://issues.apache.org/jira/browse/HIVE-11931
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Query Planning, Query Processor, SQL
>Affects Versions: 1.1.1, 1.2.1
>Reporter: NING DING
> Attachments: 00_0
>
>
> I found a join issue in hive-1.2.1 and hive-1.1.1.
> The create table sql is as below.
> {code}
> CREATE TABLE IF NOT EXISTS join_case(
> orderid  bigint,
> tradeitemid bigint,
> id bigint
> ) ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ',' 
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
> {code}
> Please put attached sample data file 00_0 in /tmp/join_case folder.
> Then load data.
> {code}
> LOAD DATA LOCAL INPATH '/tmp/join_case/00_0' OVERWRITE INTO TABLE 
> join_case;
> {code}
> The following SQL returns no results.
> {code}
> select a.id from 
> (
> select orderid as orderid, max(id) as id from join_case group by orderid
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10;
> {code}
> This issue also occurs in hive-1.1.0-cdh5.4.5.
> But in Apache hive-1.0.1 the above SQL returns 10 rows.
> After swapping the order of "orderid as orderid" and "max(id) as id", 
> the following SQL returns results in hive-1.2.1 and hive-1.1.1.
> {code}
> select a.id from 
> (
> select max(id) as id, orderid as orderid from join_case group by orderid
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10;
> {code}
> Also, the following sql can get results in hive-1.2.1 and hive-1.1.1.
> {code}
> select a.id from 
> (
> select orderid as orderid, id as id from join_case group by orderid, id
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10; 
> {code}
> Can anyone take a look at this issue? 
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12065) FS stats collection may generate incorrect stats for multi-insert query

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949992#comment-14949992
 ] 

Hive QA commented on HIVE-12065:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765494/HIVE-12065.patch

{color:green}SUCCESS:{color} +1 due to 10 test(s) being added or modified.

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5580/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5580/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5580/

Messages:
{noformat}
 This message was trimmed, see log for full details 

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ant/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ant/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ant/target/tmp/conf
 [copy] Copying 10 files to 
/data/hive-ptest/working/apache-github-source-source/ant/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-ant ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-ant ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-ant ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/ant/target/hive-ant-2.0.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hive-ant ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-ant ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/ant/target/hive-ant-2.0.0-SNAPSHOT.jar
 to 
/home/hiveptest/.m2/repository/org/apache/hive/hive-ant/2.0.0-SNAPSHOT/hive-ant-2.0.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/ant/pom.xml to 
/home/hiveptest/.m2/repository/org/apache/hive/hive-ant/2.0.0-SNAPSHOT/hive-ant-2.0.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Spark Remote Client 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spark-client ---
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/spark-client/target
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/spark-client (includes = 
[datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
spark-client ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ spark-client 
---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ spark-client ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ spark-client 
---
[INFO] Compiling 28 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java:
 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
 uses or overrides a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java:
 Recompile with -Xlint:deprecation for details.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Some input files use unchecked or unsafe operations.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client ---
[INFO] 

[jira] [Commented] (HIVE-12082) Null comparison for greatest and least operator

2015-10-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951288#comment-14951288
 ] 

Sergey Shelukhin commented on HIVE-12082:
-

Comparison predicates in ANSI SQL do evaluate to UNKNOWN, which is presumably 
NULL. I wonder, though, whether Hive evaluates the 3-valued logic correctly 
after that... Anyway, it makes sense to change those to null.

> Null comparison for greatest and least operator
> ---
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> In mysql comparisons if any of the entries are null, then the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
>  and 
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following mysql query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> |              NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> |               NULL |
> +--------------------+
> 1 row in set (0.00 sec)
> {noformat}
> This is in contrast to Hive, where null does not win over non-null values 
> in greatest and least.
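
The MySQL-style semantics described above can be sketched as follows. This is a standalone illustration, not the Hive UDF code: the class and method names are hypothetical, and the only behavior taken from the issue is that any null argument makes the result null.

```java
// Hypothetical sketch, not the Hive UDF: greatest/least that return null as
// soon as any argument is null, matching the MySQL output shown above.
final class NullAwareGreatest {

    @SafeVarargs
    static <T extends Comparable<T>> T greatest(T... values) {
        return pick(1, values);
    }

    @SafeVarargs
    static <T extends Comparable<T>> T least(T... values) {
        return pick(-1, values);
    }

    // direction > 0 keeps the largest value, direction < 0 the smallest.
    private static <T extends Comparable<T>> T pick(int direction, T[] values) {
        if (values == null || values.length == 0) {
            return null;
        }
        T best = null;
        for (T v : values) {
            if (v == null) {
                return null; // a single null decides the result
            }
            if (best == null || Integer.signum(v.compareTo(best)) == direction) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(greatest(1, 3, 2));
        System.out.println(greatest(-1, (Integer) null));
    }
}
```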



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12081) LLAP: Make explainuser_1.q test consistent

2015-10-09 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12081:
-
Attachment: HIVE-12081.1.patch

> LLAP: Make explainuser_1.q test consistent
> --
>
> Key: HIVE-12081
> URL: https://issues.apache.org/jira/browse/HIVE-12081
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12081.1.patch
>
>
> explainuser_1.q does not produce consistent results in llap vs tez. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12062) enable HBase metastore file metadata cache for tez tests

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951186#comment-14951186
 ] 

Hive QA commented on HIVE-12062:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765491/HIVE-12062.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 9357 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_join30.q-vector_data_types.q-filter_join_breaktask.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-orc_vectorization_ppd.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_bmj_schema_evolution.q-orc_merge5.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-enforce_order.q-constprog_dpp.q-auto_join1.q-and-12-more - 
did not produce a TEST-*.xml file
TestMiniTezCliDriver-orc_merge6.q-vector_outer_join0.q-mapreduce1.q-and-12-more 
- did not produce a TEST-*.xml file
TestMiniTezCliDriver-script_pipe.q-mapjoin_decimal.q-transform_ppr2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-tez_joins_explain.q-join1.q-bucket_map_join_tez1.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-unionDistinct_1.q-insert_values_non_partitioned.q-insert_update_delete.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_after_multiple_inserts.q-update_all_partitioned.q-vectorized_rcfile_columnar.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-vectorization_13.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_coalesce.q-auto_sortmerge_join_7.q-dynamic_partition_pruning.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_decimal_10_0.q-vector_acid3.q-vector_decimal_trailing.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_decimal_round.q-cbo_windowing.q-tez_schema_evolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_distinct_2.q-vector_interval_2.q-load_dyn_part2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_groupby_3.q - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_grouping_sets.q-scriptfile1.q-union2.q-and-12-more 
- did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_left_outer_join2.q-vector_outer_join5.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_partition_diff_num_cols.q-vectorization_10.q-orc_merge9.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_partitioned_date_time.q-vector_non_string_partition.q-tez_union.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorization_16.q-mapjoin_mapjoin.q-groupby2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-vector_char_mapjoin1.q-tez_insert_overwrite_local_directory_1.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5587/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5587/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5587/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765491 - PreCommit-HIVE-TRUNK-Build

> enable HBase metastore file metadata cache for tez tests
> 
>
> Key: HIVE-12062
> URL: https://issues.apache.org/jira/browse/HIVE-12062
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12062.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11642:

Attachment: HIVE-11642.24.patch

Incorporating HIVE-12081

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.23.patch, HIVE-11642.24.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-12083:

Description: 
In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
AggrStats object to be returned if partNames is empty or colNames is empty:

{code}
diff --git metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
index 0a56bac..ed810d2 100644
--- metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
+++ metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
@@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
   public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
       List<String> partNames, List<String> colNames, boolean useDensityFunctionForNDVEstimation)
       throws MetaException {
+    if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); // Nothing to aggregate.
     long partsFound = partsFoundForPartitions(dbName, tableName, partNames, colNames);
     List<ColumnStatisticsObj> colStatsList;
     // Try to read from the cache first
{code}

This runs afoul of thrift requirements that AggrStats have required fields:

{code}
struct AggrStats {
1: required list<ColumnStatisticsObj> colStats,
2: required i64 partsFound // number of partitions for which stats were found
}
{code}

Thus, we get errors as follows:

{noformat}
2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is unset! Struct:AggrStats(colStats:null, partsFound:0)
    at org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

Normally this would not occur, since HIVE-10965 also includes a client-side 
guard that skips the metastore call entirely when colNames.isEmpty(). However, 
there is no such guard for an empty partNames, and the error would still occur 
on the metastore side if the thrift call were invoked directly, as would happen 
with an older client that predates the patch.
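
One way to picture the problem and a possible remedy is the sketch below. It does not use the real Thrift-generated AggrStats class: Stats is a hypothetical stand-in that only mimics the required-field check, column statistics are represented as plain strings, and the actual fix for this issue may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hive or Thrift code: a struct with a required
// field fails validation when built with the no-arg constructor, but passes
// when the short-circuit path fills in an empty list.
final class AggrStatsSketch {

    /** Stand-in for a Thrift struct with a required colStats field. */
    static final class Stats {
        List<String> colStats; // null means "unset"
        long partsFound;

        Stats() { }

        Stats(List<String> colStats, long partsFound) {
            this.colStats = colStats;
            this.partsFound = partsFound;
        }

        void validate() {
            if (colStats == null) {
                throw new IllegalStateException("Required field 'colStats' is unset!");
            }
        }
    }

    /** Short-circuit result that still satisfies the required-field check. */
    static Stats emptyResult() {
        return new Stats(new ArrayList<String>(), 0L);
    }

    public static void main(String[] args) {
        emptyResult().validate(); // passes
        try {
            new Stats().validate();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Returning an explicitly empty list keeps the short-circuit cheap while leaving the wire format valid for callers that reach the metastore directly.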


[jira] [Updated] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-12083:

Component/s: Metastore

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur, since HIVE-10965 also includes a client-side 
> guard that skips the metastore call when colNames.isEmpty(). However, there 
> is no guard for partNames being empty, so the error would still occur on the 
> metastore side if the thrift call were made directly, as would happen with a 
> client from an older version that predates this patch.
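
A minimal sketch of one way the short-circuit could avoid the validation error: populate every required field before returning. AggrStatsSketch below is a hypothetical stand-in for the Thrift-generated AggrStats (the element type is simplified to String, and only the colStats check from the stack trace is modelled); it is not the actual Hive class or the committed fix.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the Thrift-generated AggrStats; not the real class. */
final class AggrStatsSketch {
    private List<String> colStats; // required field 1 (element type simplified)
    private long partsFound;       // required field 2

    AggrStatsSketch() { }          // leaves colStats unset, like "new AggrStats()"

    AggrStatsSketch(List<String> colStats, long partsFound) {
        this.colStats = colStats;
        this.partsFound = partsFound;
    }

    /** Mimics the check seen in the stack trace: an unset required object field is an error. */
    void validate() {
        if (colStats == null) {
            throw new IllegalStateException("Required field 'colStats' is unset!");
        }
    }

    long getPartsFound() { return partsFound; }
}

final class AggrStatsGuardSketch {
    /** Short-circuits on empty input, but returns an object whose required fields are all set. */
    static AggrStatsSketch aggrColStats(List<String> partNames, List<String> colNames) {
        if (partNames.isEmpty() || colNames.isEmpty()) {
            // Nothing to aggregate: an empty list and a zero count still pass validation.
            return new AggrStatsSketch(new ArrayList<String>(), 0L);
        }
        // Placeholder for the real aggregation, which is out of scope for this sketch.
        return new AggrStatsSketch(new ArrayList<String>(colNames), partNames.size());
    }
}
```

With this shape, an older client that sends an empty partNames list directly would receive an empty but serializable result instead of a TProtocolException.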





[jira] [Updated] (HIVE-12076) WebHCat listing jobs after the given JobId even when templeton.jobs.listorder is set to lexicographicaldesc

2015-10-09 Thread Kiran Kumar Kolli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kiran Kumar Kolli updated HIVE-12076:
-
Attachment: HIVE-12076.2.patch

Re-created the patch by pulling latest changes.

> WebHCat listing jobs after the given JobId even when templeton.jobs.listorder 
> is set to lexicographicaldesc
> ---
>
> Key: HIVE-12076
> URL: https://issues.apache.org/jira/browse/HIVE-12076
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Kiran Kumar Kolli
>Assignee: Kiran Kumar Kolli
> Fix For: 0.14.0
>
> Attachments: HIVE-12076.1.patch, HIVE-12076.2.patch
>
>
> HIVE-11724 introduced a new setting to change the order in which jobs are 
> listed. When "templeton.jobs.listorder" is set to lexicographicaldesc, 
> filtering based on jobid still returns values greater than the given jobid, 
> whereas values less than it are expected. It's a code bug.
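
The expected behaviour can be sketched as follows: when listing in descending order, "after the given jobid" should mean lexicographically smaller ids. The class, enum, and method names here are hypothetical and only illustrate the comparison direction; this is not WebHCat code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class JobListFilterSketch {
    /** Hypothetical enum; the ascending name is an assumption for illustration. */
    enum ListOrder { LEXICOGRAPHICALASC, LEXICOGRAPHICALDESC }

    /** Returns the job ids that come after the marker id in the configured listing order. */
    static List<String> jobsAfter(List<String> jobIds, String marker, ListOrder order) {
        List<String> sorted = new ArrayList<String>(jobIds);
        if (order == ListOrder.LEXICOGRAPHICALDESC) {
            Collections.sort(sorted, Collections.reverseOrder());
        } else {
            Collections.sort(sorted);
        }
        List<String> result = new ArrayList<String>();
        for (String id : sorted) {
            int cmp = id.compareTo(marker);
            // Descending: ids after the marker are smaller; ascending: they are greater.
            boolean keep = (order == ListOrder.LEXICOGRAPHICALDESC) ? cmp < 0 : cmp > 0;
            if (keep) {
                result.add(id);
            }
        }
        return result;
    }
}
```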





[jira] [Updated] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-09 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-12083:

Affects Version/s: 1.0.2
   1.2.1

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur, since HIVE-10965 also includes a client-side 
> guard that skips the metastore call when colNames.isEmpty(). However, there 
> is no guard for partNames being empty, so the error would still occur on the 
> metastore side if the thrift call were made directly, as would happen with a 
> client from an older version that predates this patch.





[jira] [Commented] (HIVE-12081) LLAP: Make explainuser_1.q test consistent

2015-10-09 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951123#comment-14951123
 ] 

Prasanth Jayachandran commented on HIVE-12081:
--

[~sershe] FYI: this should fix the inconsistency with stats. The inconsistency 
related to the Java version is still there. Make sure to use Java 7 to verify 
this patch.

> LLAP: Make explainuser_1.q test consistent
> --
>
> Key: HIVE-12081
> URL: https://issues.apache.org/jira/browse/HIVE-12081
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-12081.1.patch
>
>
> explainuser_1.q does not produce consistent results in llap vs tez. 





[jira] [Commented] (HIVE-12082) Null comparison for greatest and least operator

2015-10-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951231#comment-14951231
 ] 

Sergey Shelukhin commented on HIVE-12082:
-

Does ANSI SQL specify how this should behave?

> Null comparison for greatest and least operator
> ---
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Szehon Ho
>
> In mysql comparisons if any of the entries are null, then the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
>  and 
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following mysql query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> |              NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> |               NULL |
> +--------------------+
> {noformat}
> This is in contrast to Hive, where greatest and least do not let a null 
> argument win over non-null values.
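
The MySQL-style semantics described above can be sketched as a null-propagating comparison: as soon as any argument is null, the result is null. The class and method names are hypothetical; this is not the Hive UDF implementation.

```java
final class NullAwareGreatestSketch {
    /** Returns the largest value, or null if any argument is null (MySQL-style). */
    @SafeVarargs
    static <T extends Comparable<T>> T greatest(T... values) {
        T best = null;
        for (T v : values) {
            if (v == null) {
                return null; // one unknown argument makes the whole result unknown
            }
            if (best == null || v.compareTo(best) > 0) {
                best = v;
            }
        }
        return best;
    }

    /** Returns the smallest value, or null if any argument is null (MySQL-style). */
    @SafeVarargs
    static <T extends Comparable<T>> T least(T... values) {
        T best = null;
        for (T v : values) {
            if (v == null) {
                return null;
            }
            if (best == null || v.compareTo(best) < 0) {
                best = v;
            }
        }
        return best;
    }
}
```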





[jira] [Assigned] (HIVE-12082) Null comparison for greatest and least operator

2015-10-09 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-12082:


Assignee: Szehon Ho

> Null comparison for greatest and least operator
> ---
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> In mysql comparisons if any of the entries are null, then the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
>  and 
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following mysql query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> |              NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> |               NULL |
> +--------------------+
> {noformat}
> This is in contrast to Hive, where greatest and least do not let a null 
> argument win over non-null values.





[jira] [Updated] (HIVE-12082) Null comparison for greatest and least operator

2015-10-09 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-12082:
-
Component/s: UDF

> Null comparison for greatest and least operator
> ---
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> In mysql comparisons if any of the entries are null, then the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
>  and 
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following mysql query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> |              NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> |               NULL |
> +--------------------+
> {noformat}
> This is in contrast to Hive, where greatest and least do not let a null 
> argument win over non-null values.





[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3

2015-10-09 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951125#comment-14951125
 ] 

Prasanth Jayachandran commented on HIVE-11642:
--

HIVE-12081 should fix inconsistency with explainuser_1.q test.

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.23.patch, HIVE-11642.23.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.





[jira] [Commented] (HIVE-12082) Null comparison for greatest and least operator

2015-10-09 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951232#comment-14951232
 ] 

Szehon Ho commented on HIVE-12082:
--

I took a look; as far as I can find, the SQL standard does not specify 
'greatest' functions.

But I think the MySQL behavior makes more sense: NULL means unknown, so if any 
argument is unknown then the result is unknown.  What do you think?

> Null comparison for greatest and least operator
> ---
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> In mysql comparisons if any of the entries are null, then the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
>  and 
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following mysql query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> |              NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> |               NULL |
> +--------------------+
> {noformat}
> This is in contrast to Hive, where greatest and least do not let a null 
> argument win over non-null values.





[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951573#comment-14951573
 ] 

Hive QA commented on HIVE-11642:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765910/HIVE-11642.24.patch

{color:green}SUCCESS:{color} +1 due to 47 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9738 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_join30.q-vector_data_types.q-filter_join_breaktask.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5590/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5590/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5590/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765910 - PreCommit-HIVE-TRUNK-Build

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.23.patch, HIVE-11642.24.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.





[jira] [Updated] (HIVE-11894) CBO: Calcite Operator To Hive Operator (Calcite Return Path): correct table column name in CTAS queries

2015-10-09 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11894:
---
Attachment: HIVE-11894.06.patch

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): correct table 
> column name in CTAS queries
> ---
>
> Key: HIVE-11894
> URL: https://issues.apache.org/jira/browse/HIVE-11894
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11894.01.patch, HIVE-11894.02.patch, 
> HIVE-11894.03.patch, HIVE-11894.04.patch, HIVE-11894.05.patch, 
> HIVE-11894.06.patch
>
>
> To repro, run lineage2.q with return path turned on.





[jira] [Commented] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951614#comment-14951614
 ] 

Hive QA commented on HIVE-11914:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765672/HIVE-11914.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9660 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5591/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5591/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5591/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765672 - PreCommit-HIVE-TRUNK-Build

> When transactions gets a heartbeat, it doesn't update the lock heartbeat.
> -
>
> Key: HIVE-11914
> URL: https://issues.apache.org/jira/browse/HIVE-11914
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.0.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11914.2.patch, HIVE-11914.3.patch, HIVE-11914.patch
>
>
> TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the 
> associated locks.  This makes SHOW LOCKS confusing/misleading.
> This is especially visible in Streaming API use cases which use
> TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) 
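
The intended behaviour can be sketched with a small in-memory model in which a transaction heartbeat also refreshes the timestamps of that transaction's locks. All names below are hypothetical; the real TxnHandler works against metastore tables, which this sketch does not attempt to reproduce.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TxnHeartbeatSketch {
    private final Map<Long, Long> txnHeartbeat = new HashMap<Long, Long>();
    private final Map<Long, List<Long>> locksByTxn = new HashMap<Long, List<Long>>();
    private final Map<Long, Long> lockHeartbeat = new HashMap<Long, Long>();

    void openTxn(long txnId, long now) {
        txnHeartbeat.put(txnId, now);
        locksByTxn.put(txnId, new ArrayList<Long>());
    }

    void addLock(long txnId, long lockId, long now) {
        if (!locksByTxn.containsKey(txnId)) {
            throw new IllegalArgumentException("No such txn: " + txnId);
        }
        locksByTxn.get(txnId).add(lockId);
        lockHeartbeat.put(lockId, now);
    }

    /** Updates the txn timestamp and the timestamp of every lock the txn holds. */
    void heartbeatTxn(long txnId, long now) {
        if (!txnHeartbeat.containsKey(txnId)) {
            throw new IllegalArgumentException("No such txn: " + txnId);
        }
        txnHeartbeat.put(txnId, now);
        for (Long lockId : locksByTxn.get(txnId)) {
            lockHeartbeat.put(lockId, now);
        }
    }

    /** Range variant: heartbeats every known txn in [minTxnId, maxTxnId]. */
    void heartbeatTxnRange(long minTxnId, long maxTxnId, long now) {
        for (long id = minTxnId; id <= maxTxnId; id++) {
            if (txnHeartbeat.containsKey(id)) {
                heartbeatTxn(id, now);
            }
        }
    }

    long txnLastHeartbeat(long txnId) { return txnHeartbeat.get(txnId); }

    long lockLastHeartbeat(long lockId) { return lockHeartbeat.get(lockId); }
}
```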





[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11642:

Attachment: (was: HIVE-11642.23.patch)

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.23.patch, HIVE-11642.24.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.





[jira] [Updated] (HIVE-12065) FS stats collection may generate incorrect stats for multi-insert query

2015-10-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12065:

Attachment: HIVE-12065.2.patch

> FS stats collection may generate incorrect stats for multi-insert query
> ---
>
> Key: HIVE-12065
> URL: https://issues.apache.org/jira/browse/HIVE-12065
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12065.2.patch, HIVE-12065.patch
>
>






[jira] [Commented] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950636#comment-14950636
 ] 

Hive QA commented on HIVE-11785:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765613/HIVE-11785.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9642 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-script_pipe.q-mapjoin_decimal.q-transform_ppr2.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5585/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5585/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5585/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765613 - PreCommit-HIVE-TRUNK-Build

> Support escaping carriage return and new line for LazySimpleSerDe
> -
>
> Key: HIVE-11785
> URL: https://issues.apache.org/jira/browse/HIVE-11785
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.0.0
>
> Attachments: HIVE-11785.2.patch, HIVE-11785.3.patch, 
> HIVE-11785.patch, test.parquet
>
>
> Create the table and perform the queries as follows. You will see different 
> results when the setting changes. 
> The expected result should be:
> {noformat}
> 1 newline
> here
> 2 carriage return
> 3 both
> here
> {noformat}
> {noformat}
> hive> create table repo (lvalue int, charstring string) stored as parquet;
> OK
> Time taken: 0.34 seconds
> hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
> Loading data to table default.repo
> chgrp: changing ownership of 
> 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not 
> belong to hive
> Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, 
> rawDataSize=0]
> OK
> Time taken: 0.732 seconds
> hive> set hive.fetch.task.conversion=more;
> hive> select * from repo;
> OK
> 1 newline
> here
> here  carriage return
> 3 both
> here
> Time taken: 0.253 seconds, Fetched: 3 row(s)
> hive> set hive.fetch.task.conversion=none;
> hive> select * from repo;
> Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1441752031022_0006, Tracking URL = 
> http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/
> Kill Command = 
> /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job  
> -kill job_1441752031022_0006
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2015-09-09 11:35:54,127 Stage-1 map = 0%,  reduce = 0%
> 2015-09-09 11:36:04,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.98 
> sec
> MapReduce Total cumulative CPU time: 2 seconds 980 msec
> Ended Job = job_1441752031022_0006
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.98 sec   HDFS Read: 4251 HDFS 
> Write: 51 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 980 msec
> OK
> 1 newline
> NULL  NULL
> 2 carriage return
> NULL  NULL
> 3 both
> NULL  NULL
> Time taken: 25.131 seconds, Fetched: 6 row(s)
> hive>
> {noformat}
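
As a rough illustration of the idea in the issue title, the sketch below escapes carriage return and newline characters so that a value containing them stays on one text row, and restores them on read. The class name and the backslash escape convention are assumptions made for illustration; this is not the LazySimpleSerDe implementation.

```java
final class LineBreakEscapeSketch {
    /** Replaces backslash, newline and carriage return with two-character escapes. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Reverses escape(): turns the two-character escapes back into the original characters. */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    default: out.append(next); // covers the escaped backslash
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Escaping the backslash itself keeps the encoding reversible, so a literal backslash followed by "n" in the data is not mistaken for a newline on read.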





[jira] [Commented] (HIVE-12046) Re-create spark client if connection is dropped

2015-10-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950635#comment-14950635
 ] 

Xuefu Zhang commented on HIVE-12046:


+1 to the latest patch.

> Re-create spark client if connection is dropped
> ---
>
> Key: HIVE-12046
> URL: https://issues.apache.org/jira/browse/HIVE-12046
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12046.1.patch, HIVE-12046.2.patch
>
>
> Currently, if the connection to the spark cluster is dropped, the spark 
> client will stay in a bad state. A new Hive session is needed to re-establish 
> the connection. It is better to auto reconnect in this case.
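
The auto-reconnect idea can be sketched as a holder that re-creates its client whenever a liveness check fails. Everything here (the holder class, the Health interface, the factory) is hypothetical and independent of the actual Hive on Spark client classes.

```java
import java.util.function.Supplier;

final class ReconnectingClientSketch<C> {
    /** Hypothetical liveness check supplied by the caller. */
    interface Health<C> {
        boolean isAlive(C client);
    }

    private final Supplier<C> factory;
    private final Health<C> health;
    private C client;

    ReconnectingClientSketch(Supplier<C> factory, Health<C> health) {
        this.factory = factory;
        this.health = health;
    }

    /** Returns the current client, creating a fresh one if none exists or the old one is dead. */
    synchronized C get() {
        if (client == null || !health.isAlive(client)) {
            client = factory.get();
        }
        return client;
    }
}
```

A caller would go through get() before each use, so a dropped connection is replaced within the same session rather than requiring a new one.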





[jira] [Commented] (HIVE-11499) Datanucleus leaks classloaders when used using embedded metastore with HiveServer2 with UDFs

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950259#comment-14950259
 ] 

Hive QA commented on HIVE-11499:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765686/HIVE-11499.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9657 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5583/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5583/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5583/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765686 - PreCommit-HIVE-TRUNK-Build

> Datanucleus leaks classloaders when used using embedded metastore with 
> HiveServer2 with UDFs
> 
>
> Key: HIVE-11499
> URL: https://issues.apache.org/jira/browse/HIVE-11499
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.1.1, 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-11499.1.patch, HIVE-11499.3.patch, 
> HIVE-11499.4.patch, HS2-NucleusCache-Leak.tiff
>
>
> When UDFs are used, we create a new classloader to add the UDF jar. Similar 
> to what Hadoop's reflection utils do (HIVE-11408), DataNucleus caches the 
> classloaders 
> (https://github.com/datanucleus/datanucleus-core/blob/3.2/src/java/org/datanucleus/NucleusContext.java#L161).
> The JDOPersistenceManagerFactory (one per JVM) holds on to a NucleusContext 
> reference 
> (https://github.com/datanucleus/datanucleus-api-jdo/blob/3.2/src/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L115).
> Until we call NucleusContext#close, the classloader cache is not cleared. 
> With UDFs this can lead to a PermGen leak, as shown in the attached 
> screenshot, where NucleusContext holds on to several URLClassLoader objects.
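
The retention pattern described above can be illustrated with a minimal sketch. 
This is not DataNucleus code: LoaderCacheSketch and all of its methods are 
hypothetical names, and the cache is simplified to a list of strong references. 
It only shows why per-UDF classloaders stay reachable until the long-lived 
holder is explicitly closed.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a long-lived context that caches classloaders. */
final class LoaderCacheSketch {
    // Strong references: every loader stored here stays reachable, and so
    // do all classes it loaded, for as long as this object is alive.
    private final List<ClassLoader> cachedLoaders = new ArrayList<>();

    /** Called whenever the context sees a classloader (e.g. one per UDF jar). */
    void remember(ClassLoader loader) {
        cachedLoaders.add(loader);
    }

    /** Number of classloaders currently pinned by the cache. */
    int retainedCount() {
        return cachedLoaders.size();
    }

    /** Only an explicit close releases the cached classloaders. */
    void close() {
        cachedLoaders.clear();
    }

    /** Assumption: a fresh URLClassLoader is created for each UDF jar. */
    static ClassLoader newUdfLoader() {
        return new URLClassLoader(new URL[0], LoaderCacheSketch.class.getClassLoader());
    }
}
```

In this sketch, if one LoaderCacheSketch instance lives for the whole JVM and 
close() is never called, retainedCount() grows by one for every new UDF 
classloader, which mirrors the growth visible in the attached screenshot.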





[jira] [Updated] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-10-09 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11981:

Attachment: (was: HIVE-11981.01.patch)

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> --
>
> Key: HIVE-11981
> URL: https://issues.apache.org/jira/browse/HIVE-11981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11981.01.patch, ORC Schema Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other kinds of schema 
> evolution were not pursued due to lack of importance and lack of time. Also, 
> it appears that much more sophisticated metadata would be needed to support 
> them.
> The biggest issues for users have been adding new columns to ACID tables 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).





[jira] [Commented] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm

2015-10-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950362#comment-14950362
 ] 

Hive QA commented on HIVE-11954:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765579/HIVE-11954.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9650 tests executed
*Failed tests:*
{noformat}
TestSparkNegativeCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5584/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5584/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5584/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765579 - PreCommit-HIVE-TRUNK-Build

> Extend logic to choose side table in MapJoin Conversion algorithm
> -
>
> Key: HIVE-11954
> URL: https://issues.apache.org/jira/browse/HIVE-11954
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, 
> HIVE-11954.03.patch, HIVE-11954.04.patch, HIVE-11954.patch, HIVE-11954.patch
>
>
> Selection of the side table (in-memory/hash table) in the MapJoin Conversion 
> algorithm needs to be more sophisticated.
> In an N-way Map Join, Hive should pick as the side table (in-memory table) 
> the input stream that has the least cost in producing its relation (like 
> TS(FIL|Proj)*).
> A cost-based choice needs an extended cost model; without the return path 
> it's going to be hard to do this.
> For the time being we could employ a modified cost-based algorithm for side 
> table selection.
> The new algorithm is described below:
> 1. Identify the candidate set of inputs for the side table (in-memory/hash 
> table) from the inputs (based on conditional task size).
> 2. For each input, identify its cost and memory requirement. Cost is 1 for 
> each heavyweight relational op (Join, GB, PTF/Windowing, TF, etc.). The cost 
> of an input is the total number of heavyweight ops in its branch.
> 3. Order the set from #1 by cost and memory requirement (ascending order).
> 4. Pick the first element from #3 as the side table.
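
The four steps above can be sketched as follows. This is an illustration under 
stated assumptions, not the code in the attached patches: SideTableChooser, 
Input, and maxSideTableBytes are hypothetical names, the "conditional task 
size" check is reduced to a single byte threshold, and step 3 is read as 
"order by cost first, then by memory requirement".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration of the side-table selection steps; not Hive code. */
final class SideTableChooser {

    /** One join input: a label, its estimated in-memory size, and its cost. */
    static final class Input {
        final String name;
        final long memoryBytes; // estimated memory requirement of this input
        final int heavyOps;     // cost: number of heavyweight ops in its branch

        Input(String name, long memoryBytes, int heavyOps) {
            this.name = name;
            this.memoryBytes = memoryBytes;
            this.heavyOps = heavyOps;
        }
    }

    static Optional<Input> choose(List<Input> inputs, long maxSideTableBytes) {
        // Step 1: keep only the inputs small enough to be held in memory.
        List<Input> candidates = new ArrayList<>();
        for (Input in : inputs) {
            if (in.memoryBytes <= maxSideTableBytes) {
                candidates.add(in);
            }
        }
        // Steps 2 and 3: order by cost, then by memory requirement, ascending.
        // (Assumption: cost is the primary key and memory breaks ties.)
        candidates.sort(Comparator.comparingInt((Input in) -> in.heavyOps)
                .thenComparingLong(in -> in.memoryBytes));
        // Step 4: the first element, if any, becomes the side table.
        return candidates.isEmpty() ? Optional.empty() : Optional.of(candidates.get(0));
    }
}
```

With this ordering, a plain table scan that fits in memory is preferred over a 
smaller input that sits below a Join or GB, because the cheaper branch is 
ranked first and memory is only used to break ties.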





[jira] [Commented] (HIVE-11822) vectorize NVL UDF

2015-10-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950890#comment-14950890
 ] 

Sergey Shelukhin commented on HIVE-11822:
-

[~tasanuma0829] it looks like you need to update the vector_coalesce output for 
MiniTez. You can run it the same way as the regular test, but with 
TestMiniTezCliDriver as the test name.

> vectorize NVL UDF
> -
>
> Key: HIVE-11822
> URL: https://issues.apache.org/jira/browse/HIVE-11822
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Takanobu Asanuma
> Attachments: HIVE-11822.1.patch
>
>






[jira] [Commented] (HIVE-12048) metastore file metadata cache should not be used when deltas are present

2015-10-09 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950911#comment-14950911
 ] 

Prasanth Jayachandran commented on HIVE-12048:
--

+1

> metastore file metadata cache should not be used when deltas are present
> 
>
> Key: HIVE-12048
> URL: https://issues.apache.org/jira/browse/HIVE-12048
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12048.patch
>
>
> The previous code doesn't check for deltas before getting footers from the 
> local cache, even though stripe filtering with deltas is not possible; 
> presumably this is because checking the local cache is cheap. Make sure we 
> check early for the metastore-based cache.





[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11642:

Attachment: HIVE-11642.23.patch

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.23.patch, HIVE-11642.23.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.





[jira] [Commented] (HIVE-12060) LLAP: create separate variable for llap tests

2015-10-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950924#comment-14950924
 ] 

Sergey Shelukhin commented on HIVE-12060:
-

I'll file a jira to do all this

> LLAP: create separate variable for llap tests
> -
>
> Key: HIVE-12060
> URL: https://issues.apache.org/jira/browse/HIVE-12060
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12060.23.patch
>
>
> No real reason to just reuse the Tez one





[jira] [Commented] (HIVE-11985) don't store type names in metastore when metastore type names are not used

2015-10-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950976#comment-14950976
 ] 

Sergey Shelukhin commented on HIVE-11985:
-

[~xuefuz] does the updated patch address your concern? :)

> don't store type names in metastore when metastore type names are not used
> --
>
> Key: HIVE-11985
> URL: https://issues.apache.org/jira/browse/HIVE-11985
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11985.01.patch, HIVE-11985.02.patch, 
> HIVE-11985.patch
>
>






[jira] [Updated] (HIVE-12060) LLAP: create separate variable for llap tests

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12060:

Attachment: (was: HIVE-12060.23.patch)

> LLAP: create separate variable for llap tests
> -
>
> Key: HIVE-12060
> URL: https://issues.apache.org/jira/browse/HIVE-12060
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12060.01.patch
>
>
> No real reason to just reuse the Tez one





[jira] [Updated] (HIVE-12060) LLAP: create separate variable for llap tests

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12060:

Attachment: HIVE-12060.01.patch

Another attempt with a separate variable. I cannot understand why the stats in 
explainuser1 keep getting spurious diffs.

> LLAP: create separate variable for llap tests
> -
>
> Key: HIVE-12060
> URL: https://issues.apache.org/jira/browse/HIVE-12060
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12060.01.patch
>
>
> No real reason to just reuse the Tez one





[jira] [Updated] (HIVE-12052) automatically populate file metadata to HBase metastore based on config or table properties

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12052:

Assignee: (was: Sergey Shelukhin)

> automatically populate file metadata to HBase metastore based on config or 
> table properties
> ---
>
> Key: HIVE-12052
> URL: https://issues.apache.org/jira/browse/HIVE-12052
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> As discussed in HIVE-11500





[jira] [Updated] (HIVE-12052) automatically populate file metadata to HBase metastore based on config or table properties

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12052:

Description: 
As discussed in HIVE-11500
Should use a table property similar to auto.purge.
Then, when this property is set, when partitions are added, after compactions, 
after load/non-ACID insert, and periodically (configurable), the storage 
locations should be scanned for new files and the cache updated accordingly. 
All the updates should probably be done in a background thread and taken from a 
queue (high priority from ops, low priority from periodic updates) to avoid 
high load on HDFS from the metastore.

  was:As discussed in HIVE-11500


> automatically populate file metadata to HBase metastore based on config or 
> table properties
> ---
>
> Key: HIVE-12052
> URL: https://issues.apache.org/jira/browse/HIVE-12052
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> As discussed in HIVE-11500
> Should use a table property similar to auto.purge.
> Then, when this property is set, when partitions are added, after 
> compactions, after load/non-ACID insert, and periodically (configurable), the 
> storage locations should be scanned for new files and the cache updated 
> accordingly. All the updates should probably be done in a background thread 
> and taken from a queue (high priority from ops, low priority from periodic 
> updates) to avoid high load on HDFS from the metastore.





[jira] [Updated] (HIVE-12052) automatically populate file metadata to HBase metastore based on config or table properties

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12052:

Description: 
As discussed in HIVE-11500
Should use a table property similar to auto.purge.
Then, when this property is set, when partitions are added, after compactions, 
after load/non-ACID insert, and periodically (configurable), the storage 
locations should be scanned for new files and the cache updated accordingly. 
All the updates should probably be done in a background thread and taken from a 
queue (high priority from most ops, low priority from enabling the property and 
from periodic updates) to avoid high load on HDFS from the metastore.

  was:
As discussed in HIVE-11500
Should use a table property similar to auto.purge.
Then, when this property is set, when partitions are added, after compactions, 
after load/non-ACID insert, and periodically (configurable), the storage 
locations should be scanned for new files and the cache updated accordingly. 
All the updates should probably be done in a background thread and taken from a 
queue (high priority from ops, low priority from periodic updates) to avoid 
high load on HDFS from the metastore.


> automatically populate file metadata to HBase metastore based on config or 
> table properties
> ---
>
> Key: HIVE-12052
> URL: https://issues.apache.org/jira/browse/HIVE-12052
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> As discussed in HIVE-11500
> Should use a table property similar to auto.purge.
> Then, when this property is set, when partitions are added, after 
> compactions, after load/non-ACID insert, and periodically (configurable), the 
> storage locations should be scanned for new files and the cache updated 
> accordingly. All the updates should probably be done in a background thread 
> and taken from a queue (high priority from most ops, low priority from 
> enabling the property and from periodic updates) to avoid high load on HDFS 
> from the metastore.





[jira] [Updated] (HIVE-12052) automatically populate file metadata to HBase metastore based on config or table properties

2015-10-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12052:

Description: 
As discussed in HIVE-11500
Should use a table property similar to auto.purge.
Then, when this property is set, when partitions are added (convertToMPart is a 
good source to find all the paths for that), after compactions, after 
load/non-ACID insert, and periodically (configurable), the storage locations 
should be scanned for new files and the cache updated accordingly. All the 
updates should probably be done in a background thread and taken from a queue 
(high priority from most ops, low priority from enabling the property and from 
periodic updates) to avoid high load on HDFS from the metastore.

  was:
As discussed in HIVE-11500
Should use a table property similar to auto.purge.
Then, when this property is set, when partitions are added, after compactions, 
after load/non-ACID insert, and periodically (configurable), the storage 
locations should be scanned for new files and the cache updated accordingly. 
All the updates should probably be done in a background thread and taken from a 
queue (high priority from most ops, low priority from enabling the property and 
from periodic updates) to avoid high load on HDFS from the metastore.


> automatically populate file metadata to HBase metastore based on config or 
> table properties
> ---
>
> Key: HIVE-12052
> URL: https://issues.apache.org/jira/browse/HIVE-12052
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> As discussed in HIVE-11500
> Should use a table property similar to auto.purge.
> Then, when this property is set, when partitions are added (convertToMPart is 
> a good source to find all the paths for that), after compactions, after 
> load/non-ACID insert, and periodically (configurable), the storage locations 
> should be scanned for new files and the cache updated accordingly. All the 
> updates should probably be done in a background thread and taken from a queue 
> (high priority from most ops, low priority from enabling the property and 
> from periodic updates) to avoid high load on HDFS from the metastore.
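
The two-priority queue mentioned in the description could look roughly like the 
sketch below. It is only an illustration, assuming a single background worker 
that drains the queue; CacheRefreshQueue, Request, and Priority are 
hypothetical names and nothing here is taken from a Hive patch.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical queue of storage locations waiting to be rescanned. */
final class CacheRefreshQueue {

    /** HIGH is declared first so that it sorts ahead of LOW. */
    enum Priority { HIGH, LOW }

    static final class Request implements Comparable<Request> {
        final String location;
        final Priority priority;
        final long seq; // submission order, used to keep FIFO within a priority

        Request(String location, Priority priority, long seq) {
            this.location = location;
            this.priority = priority;
            this.seq = seq;
        }

        @Override
        public int compareTo(Request other) {
            int byPriority = priority.compareTo(other.priority);
            return byPriority != 0 ? byPriority : Long.compare(seq, other.seq);
        }
    }

    private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();
    private final AtomicLong nextSeq = new AtomicLong();

    /** Called by operations (HIGH) or by the periodic scanner (LOW). */
    void submit(String location, Priority priority) {
        queue.put(new Request(location, priority, nextSeq.getAndIncrement()));
    }

    /** Blocks until work is available; meant for the background thread. */
    Request take() throws InterruptedException {
        return queue.take();
    }

    /** Non-blocking variant; returns null when the queue is empty. */
    Request poll() {
        return queue.poll();
    }
}
```

Because one worker takes one request at a time, the number of concurrent scans 
issued against HDFS stays bounded no matter how many requests are submitted, 
and periodic rescans only run when no operation-triggered request is waiting.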





[jira] [Updated] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm

2015-10-09 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11954:
---
Attachment: HIVE-11954.05.patch

> Extend logic to choose side table in MapJoin Conversion algorithm
> -
>
> Key: HIVE-11954
> URL: https://issues.apache.org/jira/browse/HIVE-11954
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, 
> HIVE-11954.03.patch, HIVE-11954.04.patch, HIVE-11954.05.patch, 
> HIVE-11954.patch, HIVE-11954.patch
>
>
> Selection of the side table (in-memory/hash table) in the MapJoin Conversion 
> algorithm needs to be more sophisticated.
> In an N-way Map Join, Hive should pick as the side table (in-memory table) 
> the input stream that has the least cost in producing its relation (like 
> TS(FIL|Proj)*).
> A cost-based choice needs an extended cost model; without the return path 
> it's going to be hard to do this.
> For the time being we could employ a modified cost-based algorithm for side 
> table selection.
> The new algorithm is described below:
> 1. Identify the candidate set of inputs for the side table (in-memory/hash 
> table) from the inputs (based on conditional task size).
> 2. For each input, identify its cost and memory requirement. Cost is 1 for 
> each heavyweight relational op (Join, GB, PTF/Windowing, TF, etc.). The cost 
> of an input is the total number of heavyweight ops in its branch.
> 3. Order the set from #1 by cost and memory requirement (ascending order).
> 4. Pick the first element from #3 as the side table.


