[jira] [Updated] (HIVE-8458) Potential null dereference in Utilities#clearWork()

2015-04-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-8458:
-
Description: 
{code}
Path mapPath = getPlanPath(conf, MAP_PLAN_NAME);
Path reducePath = getPlanPath(conf, REDUCE_PLAN_NAME);

// if the plan path hasn't been initialized just return, nothing to clean.
if (mapPath == null && reducePath == null) {
  return;
}

try {
  FileSystem fs = mapPath.getFileSystem(conf);
{code}

If mapPath is null but reducePath is not null, the getFileSystem() call will 
produce an NPE.
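A minimal sketch of the null-safe pattern that avoids the NPE, using java.nio.file.Path as a stand-in for Hadoop's Path (this illustrates the idea, not the attached patch):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ClearWorkSketch {
    // Pick whichever plan path is non-null before dereferencing;
    // returns null when neither path was initialized (nothing to clean).
    static Path anyNonNull(Path mapPath, Path reducePath) {
        if (mapPath == null && reducePath == null) {
            return null; // nothing to clean
        }
        return (mapPath != null) ? mapPath : reducePath;
    }

    public static void main(String[] args) {
        Path reducePath = Paths.get("/tmp/reduce.plan");
        // mapPath is null, but we still get a usable path for getFileSystem()
        System.out.println(anyNonNull(null, reducePath));
    }
}
```

The same idea applied in clearWork() would call getFileSystem(conf) on the selected non-null path instead of unconditionally on mapPath.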

  was:
{code}
Path mapPath = getPlanPath(conf, MAP_PLAN_NAME);
Path reducePath = getPlanPath(conf, REDUCE_PLAN_NAME);

// if the plan path hasn't been initialized just return, nothing to clean.
if (mapPath == null && reducePath == null) {
  return;
}

try {
  FileSystem fs = mapPath.getFileSystem(conf);
{code}
If mapPath is null but reducePath is not null, the getFileSystem() call will 
produce an NPE.


 Potential null dereference in Utilities#clearWork()
 ---

 Key: HIVE-8458
 URL: https://issues.apache.org/jira/browse/HIVE-8458
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Ted Yu
Assignee: skrho
Priority: Minor
 Attachments: HIVE-8458_001.patch


 {code}
 Path mapPath = getPlanPath(conf, MAP_PLAN_NAME);
 Path reducePath = getPlanPath(conf, REDUCE_PLAN_NAME);
 // if the plan path hasn't been initialized just return, nothing to clean.
 if (mapPath == null && reducePath == null) {
   return;
 }
 try {
   FileSystem fs = mapPath.getFileSystem(conf);
 {code}
 If mapPath is null but reducePath is not null, the getFileSystem() call will 
 produce an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10454) Query against partitioned table in strict mode failed with No partition predicate found even if partition predicate is specified.

2015-04-23 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509254#comment-14509254
 ] 

Aihua Xu commented on HIVE-10454:
-

That condition doesn't check how many partitions will be involved. It's just 
reminding you that you need to provide predicates. 

We will always have this issue with nondeterministic UDFs like unix_timestamp(), 
even with a query like:

select * from t1 where t1.c2 = to_date(date_add(from_unixtime( unix_timestamp() 
),1));

For a predicate with a nondeterministic UDF, the predicate won't be pushed down 
to the TableScanOperator, but currently we only check whether the 
TableScanOperator has a predicate.

So we need to check not only whether the TableScanOperator has predicates but 
also its child ops (e.g., FilterOperator) to determine whether the table has a 
predicate.
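A small sketch of that deeper check, with simplified stand-in Operator/FilterOperator classes rather than Hive's real org.apache.hadoop.hive.ql.exec types:

```java
import java.util.ArrayList;
import java.util.List;

public class PredicateCheck {
    // Simplified stand-ins for Hive's operator tree; not the real API.
    static class Operator {
        final List<Operator> children = new ArrayList<>();
        boolean hasPredicate() { return false; }
    }

    static class FilterOperator extends Operator {
        @Override boolean hasPredicate() { return true; }
    }

    // True if op or any descendant carries a predicate; mirrors the idea of
    // inspecting a TableScanOperator's child ops, not just the operator itself.
    static boolean hasPredicateDeep(Operator op) {
        if (op.hasPredicate()) {
            return true;
        }
        for (Operator child : op.children) {
            if (hasPredicateDeep(child)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Operator tableScan = new Operator();          // predicate not pushed down
        tableScan.children.add(new FilterOperator()); // it lives in a child instead
        System.out.println(hasPredicateDeep(tableScan)); // true
    }
}
```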

 Query against partitioned table in strict mode failed with No partition 
 predicate found even if partition predicate is specified.
 ---

 Key: HIVE-10454
 URL: https://issues.apache.org/jira/browse/HIVE-10454
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu

 The following queries fail:
 {noformat}
 create table t1 (c1 int) PARTITIONED BY (c2 string);
 set hive.mapred.mode=strict;
 select * from t1 where t1.c2 > to_date(date_add(from_unixtime( 
 unix_timestamp() ),1));
 {noformat}
 The query failed with "No partition predicate found for alias t1".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10462) CBO (Calcite Return Path): Exception thrown in conversion to MapJoin

2015-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509251#comment-14509251
 ] 

Hive QA commented on HIVE-10462:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727612/HIVE-10462.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3544/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3544/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3544/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727612 - PreCommit-HIVE-TRUNK-Build

 CBO (Calcite Return Path): Exception thrown in conversion to MapJoin
 

 Key: HIVE-10462
 URL: https://issues.apache.org/jira/browse/HIVE-10462
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10462.patch


 When the return path is on, the mapjoin conversion optimization fails as some 
 DS in the Join descriptor have not been initialized properly.
 The failure can be reproduced with auto_join4.q. In particular, the following 
 Exception is thrown:
 {noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: Generate Map Join Task 
 Error: null
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinTaskDispatcher.processCurrentTask(CommonJoinTaskDispatcher.java:516)
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:179)
 at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
 at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
 at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:79)
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
 at 
 org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
 at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:203)

[jira] [Assigned] (HIVE-10463) CBO (Calcite Return Path): Insert overwrite... select * from... queries failing for bucketed tables

2015-04-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-10463:
--

Assignee: Jesus Camacho Rodriguez

 CBO (Calcite Return Path): Insert overwrite... select * from... queries 
 failing for bucketed tables
 ---

 Key: HIVE-10463
 URL: https://issues.apache.org/jira/browse/HIVE-10463
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0


 This happens when the return path is on. To reproduce the exception, take the 
 following excerpt from auto_sortmerge_join_10.q:
 {noformat}
 set hive.enforce.bucketing = true;
 set hive.enforce.sorting = true;
 set hive.exec.reducers.max = 1;
 CREATE TABLE tbl1(key int, value string) CLUSTERED BY (key) SORTED BY (key) 
 INTO 2 BUCKETS;
 insert overwrite table tbl1
 select * from src where key < 10;
 {noformat}
 It produces the following Exception:
 {noformat}
 java.lang.Exception: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
 Caused by: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 10 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:157)
 ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col1]
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:446)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:150)
 ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field key from [0:_col0, 
 1:_col1]
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:416)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:978)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:383)
 ... 22 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10463) CBO (Calcite Return Path): Insert overwrite... select * from... queries failing for bucketed tables

2015-04-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10463:
---
Assignee: (was: Jesus Camacho Rodriguez)

 CBO (Calcite Return Path): Insert overwrite... select * from... queries 
 failing for bucketed tables
 ---

 Key: HIVE-10463
 URL: https://issues.apache.org/jira/browse/HIVE-10463
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
 Fix For: 1.2.0


 This happens when the return path is on. To reproduce the exception, take the 
 following excerpt from auto_sortmerge_join_10.q:
 {noformat}
 set hive.enforce.bucketing = true;
 set hive.enforce.sorting = true;
 set hive.exec.reducers.max = 1;
 CREATE TABLE tbl1(key int, value string) CLUSTERED BY (key) SORTED BY (key) 
 INTO 2 BUCKETS;
 insert overwrite table tbl1
 select * from src where key < 10;
 {noformat}
 It produces the following Exception:
 {noformat}
 java.lang.Exception: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
 Caused by: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 10 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:157)
 ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col1]
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:446)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:150)
 ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field key from [0:_col0, 
 1:_col1]
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:416)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:978)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:383)
 ... 22 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9957) Hive 1.1.0 not compatible with Hadoop 2.4.0

2015-04-23 Thread subhashmv (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509420#comment-14509420
 ] 

subhashmv commented on HIVE-9957:
-

Actually, I installed Hive 1.0 and Hadoop 2.5, and it says that they are 
incompatible. Please tell me how to apply the patch; I can't understand the 
link provided by Lefty.

 Hive 1.1.0 not compatible with Hadoop 2.4.0
 ---

 Key: HIVE-9957
 URL: https://issues.apache.org/jira/browse/HIVE-9957
 Project: Hive
  Issue Type: Bug
  Components: Encryption
Reporter: Vivek Shrivastava
Assignee: Sergio Peña
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9957.1.patch


 Getting this exception while accessing data through Hive. 
 Exception in thread "main" java.lang.NoSuchMethodError: 
 org.apache.hadoop.hdfs.DFSClient.getKeyProvider()Lorg/apache/hadoop/crypto/key/KeyProvider;
 at 
 org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.init(Hadoop23Shims.java:1152)
 at 
 org.apache.hadoop.hive.shims.Hadoop23Shims.createHdfsEncryptionShim(Hadoop23Shims.java:1279)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.getHdfsEncryptionShim(SessionState.java:392)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1756)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:1875)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1689)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1427)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10132)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-23 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509570#comment-14509570
 ] 

Aihua Xu commented on HIVE-9917:


Thanks [~jdere].

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Fix For: 1.2.0

 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting 
 int to timestamp. Since customers have been relying on the incorrect behavior 
 for so long, it is better to make it configurable so that one release defaults 
 to the old/inconsistent way and the next release defaults to the new/consistent 
 way. After that we will deprecate it.
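A sketch of the two interpretations being made configurable: historically Hive treated an integer cast to timestamp as milliseconds, while the consistent behavior (matching unix_timestamp()) treats it as seconds. The flag here is an illustrative boolean parameter, not necessarily the actual Hive configuration property name:

```java
public class IntToTimestampConversion {
    // New/consistent way (intervalInSeconds = true): value is seconds since
    // the epoch, so scale up to milliseconds.
    // Old/inconsistent way (intervalInSeconds = false): value is treated as
    // milliseconds already.
    static long toEpochMillis(long value, boolean intervalInSeconds) {
        return intervalInSeconds ? value * 1000L : value;
    }

    public static void main(String[] args) {
        long v = 1429747200L; // 2015-04-23 00:00:00 UTC, in seconds
        System.out.println(toEpochMillis(v, true));  // 1429747200000
        System.out.println(toEpochMillis(v, false)); // 1429747200
    }
}
```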



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column

2015-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509346#comment-14509346
 ] 

Ashutosh Chauhan commented on HIVE-10413:
-

In place of findIn(), ExprNodeDescUtils::indexOf can be used.
Other than that, looks good. +1. Tested with the given queries.

 [CBO] Return path assumes distinct column cant be same as grouping column
 -

 Key: HIVE-10413
 URL: https://issues.apache.org/jira/browse/HIVE-10413
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-10413.1.patch, HIVE-10413.2.patch, HIVE-10413.patch


 Found in cbo_udf_udaf.q tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-9451) Add max size of column dictionaries to ORC metadata

2015-04-23 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HIVE-9451:
---

Assignee: Owen O'Malley

 Add max size of column dictionaries to ORC metadata
 ---

 Key: HIVE-9451
 URL: https://issues.apache.org/jira/browse/HIVE-9451
 Project: Hive
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 To predict the amount of memory required to read an ORC file we need to know 
 the size of the dictionaries for the columns that we are reading. I propose 
 adding the number of bytes for each column's dictionary to the stripe's 
 column statistics. The file's column statistics would have the maximum 
 dictionary size for each column.
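A rough sketch of how a reader could use such file-level statistics to bound memory before opening a file: sum the maximum dictionary bytes of only the columns being read. The column names and sizes here are made up, and the statistics API is hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrcMemoryEstimate {
    // Sum the maximum dictionary bytes of just the columns being read;
    // columns without a recorded dictionary contribute zero.
    static long estimateDictionaryBytes(Map<String, Long> maxDictBytes,
                                        String... readCols) {
        long total = 0;
        for (String col : readCols) {
            total += maxDictBytes.getOrDefault(col, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical per-column maximum dictionary sizes from file statistics.
        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("key", 4096L);
        stats.put("value", 1048576L);
        System.out.println(estimateDictionaryBytes(stats, "value")); // 1048576
    }
}
```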



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509385#comment-14509385
 ] 

Hive QA commented on HIVE-5672:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727622/HIVE-5672.5.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8729 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3545/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3545/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3545/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727622 - PreCommit-HIVE-TRUNK-Build

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great, but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7711) Error Serializing GenericUDF

2015-04-23 Thread ankush (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509422#comment-14509422
 ] 

ankush commented on HIVE-7711:
--

Could you please let me know how I can find the Kryo version that I am using?

Please help

 Error Serializing GenericUDF
 

 Key: HIVE-7711
 URL: https://issues.apache.org/jira/browse/HIVE-7711
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Dr. Christian Betz
 Attachments: HIVE-7711.1.patch.txt


 I get an exception running a job with a GenericUDF in HIVE 0.13.0 (which was 
 ok in HIVE 0.12.0).
 The org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc is serialized 
 using Kryo, which tries to serialize state in my GenericUDF that is not 
 serializable (doesn't implement Serializable).
 Switching to Kryo made the comment in ExprNodeGenericFuncDesc obsolete:
 /**
* In case genericUDF is Serializable, we will serialize the object.
*
* In case genericUDF does not implement Serializable, Java will remember 
 the
* class of genericUDF and creates a new instance when deserialized. This is
* exactly what we want.
*/
 Find the stacktrace below, however, the description above should be clear.
 Exception in thread main 
 org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.UnsupportedOperationException
 Serialization trace:
 value (java.util.concurrent.atomic.AtomicReference)
 state (clojure.lang.Atom)
 state (udfs.ArraySum)
 genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
 colExprMap (org.apache.hadoop.hive.ql.exec.SelectOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
 aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
 mapWork (org.apache.hadoop.hive.ql.plan.MapredWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 
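A common workaround for the failure described above is to mark the non-serializable runtime state transient (Kryo's FieldSerializer skips transient fields by default) and re-create it lazily after deserialization. The sketch below is illustrative only: "ArraySumUDF" and its field are hypothetical stand-ins, not the reporter's actual Clojure-backed class or Hive's GenericUDF API.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ArraySumUDF {
    // transient: Kryo's FieldSerializer never writes this field, so the
    // AtomicReference (whose contents are not serializable in the reported
    // case) never reaches the wire.
    private transient AtomicReference<Double> state;

    private AtomicReference<Double> state() {
        if (state == null) {                  // rebuilt after deserialization
            state = new AtomicReference<>(0.0);
        }
        return state;
    }

    public double add(double v) {
        return state().accumulateAndGet(v, Double::sum);
    }

    public static void main(String[] args) {
        ArraySumUDF udf = new ArraySumUDF();
        udf.add(1.5);
        System.out.println(udf.add(2.5));     // 4.0
    }
}
```

The lazy getter matters: after Kryo deserializes the object, the transient field is null, so the first call re-initializes it on the executor.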

[jira] [Commented] (HIVE-8343) Return value from BlockingQueue.offer() is not checked in DynamicPartitionPruner

2015-04-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509320#comment-14509320
 ] 

Ted Yu commented on HIVE-8343:
--

lgtm

 Return value from BlockingQueue.offer() is not checked in 
 DynamicPartitionPruner
 

 Key: HIVE-8343
 URL: https://issues.apache.org/jira/browse/HIVE-8343
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: JongWon Park
Priority: Minor
 Attachments: HIVE-8343.patch


 In addEvent() and processVertex(), there is call such as the following:
 {code}
   queue.offer(event);
 {code}
 The return value should be checked. If false is returned, the event will not 
 have been queued.
 Take a look at line 328 in:
 http://fuseyism.com/classpath/doc/java/util/concurrent/LinkedBlockingQueue-source.html
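For illustration, the dropped-event failure mode can be reproduced with a deliberately bounded queue (the pruner's own LinkedBlockingQueue may be unbounded, in which case offer() only fails under explicit capacity pressure; the capacity of 1 here is purely for demonstration):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class OfferCheck {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1);

        queue.offer("event-1");                    // accepted, queue now full
        boolean accepted = queue.offer("event-2"); // rejected: returns false
        System.out.println("event-2 queued: " + accepted);

        // One possible fix: use put(), which blocks until space is available,
        // instead of discarding the boolean returned by offer().
        queue.poll();
        queue.put("event-2");
        System.out.println("size: " + queue.size());
    }
}
```

If ignoring the boolean from offer() is intended, Java convention is to at least log the rejection; otherwise put() (blocking) or add() (throwing) makes the contract explicit.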



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10456) Grace Hash Join should not load spilled partitions on abort

2015-04-23 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509487#comment-14509487
 ] 

Gunther Hagleitner commented on HIVE-10456:
---

This doesn't seem right - the clear method you use leaves a broken partition 
behind right? You clean up some stuff but don't nuke the whole container. I 
think the logic should be: if it has any spilled partitions, throw away the 
whole container and make sure it's not in the cache (shouldn't be). If no 
partitions are spilled, leave it alone for reuse.

 Grace Hash Join should not load spilled partitions on abort
 ---

 Key: HIVE-10456
 URL: https://issues.apache.org/jira/browse/HIVE-10456
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10456.1.patch


 Grace Hash Join loads the spilled partitions to complete the join in 
 closeOp(). This should not happen when closeOp with abort is invoked. Instead 
 it should clean up all the spilled data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10464) How i find the kryo version

2015-04-23 Thread ankush (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509489#comment-14509489
 ] 

ankush commented on HIVE-10464:
---

thank you Lefty

 How i find the kryo version 
 

 Key: HIVE-10464
 URL: https://issues.apache.org/jira/browse/HIVE-10464
 Project: Hive
  Issue Type: Improvement
Reporter: ankush

 Could you please let me know how i find the kryo version that i using ?
 Please help me on this,
 We are just running HQL (Hive) queries



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10463) CBO (Calcite Return Path): Insert overwrite... select * from... queries failing for bucketed tables

2015-04-23 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10463:
--
Assignee: Laljo John Pullokkaran

 CBO (Calcite Return Path): Insert overwrite... select * from... queries 
 failing for bucketed tables
 ---

 Key: HIVE-10463
 URL: https://issues.apache.org/jira/browse/HIVE-10463
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0


 This happens when return path is on. To reproduce the exception, take the 
 following excerpt from auto_sortmerge_join_10.q:
 {noformat}
 set hive.enforce.bucketing = true;
 set hive.enforce.sorting = true;
 set hive.exec.reducers.max = 1;
 CREATE TABLE tbl1(key int, value string) CLUSTERED BY (key) SORTED BY (key) 
 INTO 2 BUCKETS;
 insert overwrite table tbl1
 select * from src where key < 10;
 {noformat}
 It produces the following Exception:
 {noformat}
 java.lang.Exception: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
 Caused by: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 10 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:157)
 ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col1]
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:446)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:150)
 ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field key from [0:_col0, 
 1:_col1]
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:416)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:978)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:383)
 ... 22 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509317#comment-14509317
 ] 

Ashutosh Chauhan commented on HIVE-10416:
-

Is this patch ready for commit or does it need more work ?

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.01.patch, HIVE-10416.patch


 When return path is on, if the plan's top operator is a Sort, we need to 
 produce a SelectOp that will output exactly the columns needed by the FS.
 The following query reproduces the problem:
 {noformat}
 select cbo_t3.c_int, c, count(*)
 from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1
 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0)
 group by c_float, cbo_t1.c_int, key order by a) cbo_t1
 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2
 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0)
 group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on 
 cbo_t1.a=p
 join cbo_t3 on cbo_t1.a=key
 where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0)
 group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10464) How i find the kryo version

2015-04-23 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-10464.

Resolution: Invalid

 How i find the kryo version 
 

 Key: HIVE-10464
 URL: https://issues.apache.org/jira/browse/HIVE-10464
 Project: Hive
  Issue Type: Improvement
Reporter: ankush

 Could you please let me know how i find the kryo version that i using ?
 Please help me on this,
 We are just running HQL (Hive) queries



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10454) Query against partitioned table in strict mode failed with No partition predicate found even if partition predicate is specified.

2015-04-23 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509545#comment-14509545
 ] 

Aihua Xu commented on HIVE-10454:
-

In this query, I'm not filtering the rows but filtering the partitions, so we 
won't scan all the partitions. Strict mode by definition should allow such a 
query.

 Query against partitioned table in strict mode failed with No partition 
 predicate found even if partition predicate is specified.
 ---

 Key: HIVE-10454
 URL: https://issues.apache.org/jira/browse/HIVE-10454
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu

 The following queries fail:
 {noformat}
 create table t1 (c1 int) PARTITIONED BY (c2 string);
 set hive.mapred.mode=strict;
 select * from t1 where t1.c2 > to_date(date_add(from_unixtime( 
 unix_timestamp() ),1));
 {noformat}
 The query failed with No partition predicate found for alias t1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-23 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-5672:

Attachment: HIVE-5672.5.patch

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-23 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509322#comment-14509322
 ] 

Jesus Camacho Rodriguez commented on HIVE-10416:


[~ashutoshc], not yet, I need to discuss with John about his comment.

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.01.patch, HIVE-10416.patch


 When return path is on, if the plan's top operator is a Sort, we need to 
 produce a SelectOp that will output exactly the columns needed by the FS.
 The following query reproduces the problem:
 {noformat}
 select cbo_t3.c_int, c, count(*)
 from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1
 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0)
 group by c_float, cbo_t1.c_int, key order by a) cbo_t1
 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2
 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0)
 group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on 
 cbo_t1.a=p
 join cbo_t3 on cbo_t1.a=key
 where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0)
 group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10454) Query against partitioned table in strict mode failed with No partition predicate found even if partition predicate is specified.

2015-04-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509403#comment-14509403
 ] 

Xuefu Zhang commented on HIVE-10454:


I think the point of strict mode is to prevent a full scan of all partitions of 
a table. In your case, while rows are filtered, the scanner will still have to 
scan all partitions, which is exactly what strict mode is meant to prevent.

 Query against partitioned table in strict mode failed with No partition 
 predicate found even if partition predicate is specified.
 ---

 Key: HIVE-10454
 URL: https://issues.apache.org/jira/browse/HIVE-10454
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu

 The following queries fail:
 {noformat}
 create table t1 (c1 int) PARTITIONED BY (c2 string);
 set hive.mapred.mode=strict;
 select * from t1 where t1.c2 > to_date(date_add(from_unixtime( 
 unix_timestamp() ),1));
 {noformat}
 The query failed with No partition predicate found for alias t1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]

2015-04-23 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-10302:
---
Attachment: HIVE-10302.2-spark.patch

 Load small tables (for map join) in executor memory only once [Spark Branch]
 

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.2-spark.patch, HIVE-10302.spark-1.patch


 Usually there are multiple cores in a Spark executor, and thus it's possible 
 that multiple map-join tasks can be running in the same executor 
 (concurrently or sequentially). Currently, each task will load its own copy 
 of the small tables for map join into memory, ending up with inefficiency. 
 Ideally, we only load the small tables once and share them among the tasks 
 running in that executor.
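The load-once-and-share behavior proposed above can be sketched with a per-executor cache keyed by the small table's path, where computeIfAbsent guarantees that concurrent tasks trigger a single load and then share the result. All names below, and the long[] standing in for a loaded hash table, are illustrative, not Hive's actual classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SmallTableCache {
    private static final Map<String, long[]> CACHE = new ConcurrentHashMap<>();
    static int loads = 0;                     // counts expensive loads

    static long[] load(String path) {
        // computeIfAbsent runs the loader at most once per key, even if
        // several map-join tasks request the same table concurrently.
        return CACHE.computeIfAbsent(path, p -> {
            loads++;                          // in reality: read table from HDFS
            return new long[] {1L, 2L, 3L};
        });
    }

    public static void main(String[] args) {
        long[] first = load("/warehouse/small_t");   // first task: loads
        long[] second = load("/warehouse/small_t");  // later task: reuses
        System.out.println("loads=" + loads + ", same=" + (first == second));
    }
}
```

A real implementation would also need an eviction policy, since small tables cached per executor otherwise accumulate for the lifetime of the JVM.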



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Affects Version/s: (was: 1.1.0)
   1.2.0

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2resultSetcompressor.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attaching a design document explaining the changes, experimental results 
 document, and a pdf explaining how to setup the docker container to observe 
 end-to-end functionality of ResultSet compression. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10464) How i find the kryo version

2015-04-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509480#comment-14509480
 ] 

Lefty Leverenz commented on HIVE-10464:
---

You can ask this question on the u...@hive.apache.org mailing list.

* [User Mailing List | http://hive.apache.org/mailing_lists.html]

 How i find the kryo version 
 

 Key: HIVE-10464
 URL: https://issues.apache.org/jira/browse/HIVE-10464
 Project: Hive
  Issue Type: Improvement
Reporter: ankush

 Could you please let me know how i find the kryo version that i using ?
 Please help me on this,
 We are just running HQL (Hive) queries



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10391) CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column

2015-04-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509578#comment-14509578
 ] 

Lefty Leverenz commented on HIVE-10391:
---

The commit gives the umbrella jira number (HIVE-9132) instead of this one 
(HIVE-10391), although the summary text is correct.

It's commit 5a576b6fbf1680ab4dd8f275cad484a2614ef2c1.

 CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter 
 does not include a partition column
 -

 Key: HIVE-10391
 URL: https://issues.apache.org/jira/browse/HIVE-10391
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10391.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10466) LLAP: fix container sizing configuration for memory

2015-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10466:

Description: 


We cannot use full machine for LLAP due to config for cache and executors being 
split brain... please refer to  [~gopalv] for details

  was:
This is [~sershe] impersonating :)

We cannot use full machine for LLAP due to config for cache and executors being 
split brain... please refer to  [~gopalv] for details


 LLAP: fix container sizing configuration for memory
 ---

 Key: HIVE-10466
 URL: https://issues.apache.org/jira/browse/HIVE-10466
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Vikram Dixit K

 We cannot use full machine for LLAP due to config for cache and executors 
 being split brain... please refer to  [~gopalv] for details



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10370) Hive does not compile with -Phadoop-1 option

2015-04-23 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509772#comment-14509772
 ] 

Vaibhav Gumashta commented on HIVE-10370:
-

+1

 Hive does not compile with -Phadoop-1 option
 

 Key: HIVE-10370
 URL: https://issues.apache.org/jira/browse/HIVE-10370
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-10370.1.patch


 Running into the below error while running mvn clean install -Pdist 
 -Phadoop-1
 {code}
 [ERROR]hive/serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleFast.java:[164,33]
  cannot find symbol
   symbol:   method copyBytes()
   location: variable serialized of type org.apache.hadoop.io.Text
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10443) HIVE-9870 broke hadoop-1 build

2015-04-23 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509771#comment-14509771
 ] 

Prasanth Jayachandran commented on HIVE-10443:
--

LGTM, +1.

 HIVE-9870 broke hadoop-1 build
 --

 Key: HIVE-10443
 URL: https://issues.apache.org/jira/browse/HIVE-10443
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Vaibhav Gumashta
 Fix For: 1.2.0

 Attachments: HIVE-10443.1.patch


 JvmPauseMonitor added in HIVE-9870 is breaking hadoop-1 build. 
 HiveServer2.startPauseMonitor() does not use reflection properly to start 
 JvmPauseMonitor.
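The general pattern for starting an optional class via reflection, so that the caller still compiles and runs when the class is absent from the classpath (the hadoop-1 case above), can be sketched as follows. The class and method names are illustrative, not the actual HiveServer2.startPauseMonitor() code.

```java
public class PauseMonitorStarter {

    // Returns true only if the class was found, instantiated via its
    // no-arg constructor, and its start() method invoked successfully.
    static boolean start(String className) {
        try {
            Class<?> cls = Class.forName(className);
            Object monitor = cls.getConstructor().newInstance();
            cls.getMethod("start").invoke(monitor);
            return true;
        } catch (ReflectiveOperationException e) {
            // Class missing or incompatible: degrade gracefully, don't crash.
            return false;
        }
    }

    public static void main(String[] args) {
        // On a hadoop-2 classpath this would find the real JvmPauseMonitor;
        // on hadoop-1 it simply returns false.
        System.out.println(start("org.apache.hadoop.util.JvmPauseMonitor"));
    }
}
```

Catching ReflectiveOperationException (rather than letting NoClassDefFoundError or similar escape) is what keeps the hadoop-1 build and runtime unaffected.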



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10447) Beeline JDBC Driver to support 2 way SSL

2015-04-23 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10447:
-
Attachment: HIVE-10447.1.patch

cc-ing [~thejas] , [~vgumashta] for review.

 Beeline JDBC Driver to support 2 way SSL
 

 Key: HIVE-10447
 URL: https://issues.apache.org/jira/browse/HIVE-10447
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10447.1.patch


 This jira should cover 2-way SSL authentication between the JDBC client and 
 server, which requires support in the driver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10465) whitelist restrictions don't get initialized in new copy of HiveConf

2015-04-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10465:
-
Summary: whitelist restrictions don't get initialized in new copy of 
HiveConf  (was: whitelist restrictions don't get initialized in initial part of 
session)

 whitelist restrictions don't get initialized in new copy of HiveConf
 

 Key: HIVE-10465
 URL: https://issues.apache.org/jira/browse/HIVE-10465
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair

 Whitelist restrictions use a regex pattern in HiveConf, but when a new 
 HiveConf object copy is created, the regex pattern is not initialized in the 
 new HiveConf copy.
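The copy-constructor pitfall described above generalizes: any derived field (here, the compiled whitelist Pattern) must be re-initialized in the copy, or the copy silently enforces nothing. The sketch below is a minimal illustration; the class and field names are hypothetical, not HiveConf's actual members.

```java
import java.util.regex.Pattern;

public class ConfCopy {
    private String whitelistRegex;
    private transient Pattern whitelistPattern;   // derived state

    ConfCopy(String regex) {
        this.whitelistRegex = regex;
        this.whitelistPattern = Pattern.compile(regex);
    }

    ConfCopy(ConfCopy other) {
        this.whitelistRegex = other.whitelistRegex;
        // The fix: rebuild derived state instead of leaving it null
        // (copying only the plain fields would drop the restriction).
        this.whitelistPattern = Pattern.compile(whitelistRegex);
    }

    boolean allowed(String key) {
        return whitelistPattern.matcher(key).matches();
    }

    public static void main(String[] args) {
        ConfCopy copy = new ConfCopy(new ConfCopy("hive\\..*"));
        System.out.println(copy.allowed("hive.exec.parallel")); // true
    }
}
```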



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10456) Grace Hash Join should not load spilled partitions on abort

2015-04-23 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509719#comment-14509719
 ] 

Gunther Hagleitner commented on HIVE-10456:
---

Summary of offline discussion w/ [~prasanth_j]:

- Probably best to check if hash table is in registry (on abort). If it is: 
ownership is shared, no need to clean up. If it isn't: MapJoinOp owns the table 
container and needs to clean up (+ free mem).
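The ownership rule from the discussion above can be sketched as: on abort, release the container only when it is not registered in the shared cache. All names here (TableContainer, REGISTRY, closeOnAbort) are illustrative stand-ins, not Hive's actual MapJoin classes.

```java
import java.util.HashSet;
import java.util.Set;

public class AbortCleanup {
    static class TableContainer {
        boolean released = false;
        void release() { released = true; }   // free memory / delete spill files
    }

    // Simulated registry of hash tables shared across operators.
    static final Set<TableContainer> REGISTRY = new HashSet<>();

    static void closeOnAbort(TableContainer container) {
        if (!REGISTRY.contains(container)) {
            container.release();              // operator-owned: clean up fully
        }
        // else: ownership is shared via the registry, leave it for reuse
    }

    public static void main(String[] args) {
        TableContainer shared = new TableContainer();
        REGISTRY.add(shared);
        TableContainer owned = new TableContainer();
        closeOnAbort(shared);
        closeOnAbort(owned);
        System.out.println(shared.released + " " + owned.released);
    }
}
```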

 Grace Hash Join should not load spilled partitions on abort
 ---

 Key: HIVE-10456
 URL: https://issues.apache.org/jira/browse/HIVE-10456
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10456.1.patch


 Grace Hash Join loads the spilled partitions to complete the join in 
 closeOp(). This should not happen when closeOp with abort is invoked. Instead 
 it should clean up all the spilled data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509721#comment-14509721
 ] 

Hive QA commented on HIVE-10434:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727680/HIVE-10434.4-spark.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8721 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did 
not produce a TEST-*.xml file
TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not 
produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/834/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/834/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-834/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727680 - PreCommit-HIVE-SPARK-Build

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
 HIVE-10434.4-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote Driver process and 
 then waits for it to connect back to HS2. However, in certain situations (for 
 instance, a permission issue), the remote process may fail and exit with an 
 error code. In this situation, the HS2 process will still wait for the 
 process to connect, sitting through a full timeout period before it throws 
 the exception.
 What makes it worse, the user may need to wait through two timeout periods: 
 one for SparkSetReducerParallelism, and another for the actual Spark job. 
 This could be very annoying.
 We should cancel the timeout task as soon as we find out that the process has 
 failed, and mark the promise as failed. 
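
A rough sketch of the fix described above, using invented names (DriverWatcher, awaitConnection) rather than Hive's actual SparkClientImpl API: a watcher thread observes the child process and cancels the connection-timeout task the moment the process dies, failing the promise immediately instead of after the full timeout.

```java
import java.util.concurrent.*;

// Illustrative only: stands in for SparkClientImpl's promise + timeout task.
class DriverWatcher {
    static String awaitConnection(Callable<Integer> process, long timeoutMs)
            throws Exception {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        CompletableFuture<String> promise = new CompletableFuture<>();
        // Timeout task: fails the promise if nothing has happened in time.
        ScheduledFuture<?> timeout = scheduler.schedule(
                () -> promise.completeExceptionally(
                        new TimeoutException("driver did not connect")),
                timeoutMs, TimeUnit.MILLISECONDS);
        // Watcher: observes the process exit code (blocks like Process.waitFor()).
        new Thread(() -> {
            try {
                int exit = process.call();
                if (exit != 0) {
                    timeout.cancel(false);   // do not sit out the full timeout
                    promise.completeExceptionally(
                            new RuntimeException("driver exited with code " + exit));
                }
            } catch (Exception e) {
                promise.completeExceptionally(e);
            }
        }).start();
        try {
            return promise.get();            // fails fast on process death
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

With this shape, a driver that dies right away surfaces the failure in milliseconds even when the configured connect timeout is minutes long.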



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Description: 
This JIRA proposes an architecture for enabling ResultSet compression which 
uses an external plugin. 

The patch has three aspects to it: 
0. An architecture for enabling ResultSet compression with external plugins
1. An example plugin to demonstrate end-to-end functionality 
2. A container to allow everyone to write and test ResultSet compressors with a 
query submitter (https://github.com/xiaom/hs2driver) 

Also attaching a design document explaining the changes, experimental results 
document, and a pdf explaining how to setup the docker container to observe 
end-to-end functionality of ResultSet compression. 



  was:
This JIRA proposes an architecture for enabling ResultSet compression which 
uses an external plugin. 

The patch has three aspects to it: 
0. An architecture for enabling ResultSet compression with external plugins
1. An example plugin to demonstrate end-to-end functionality 
2. A container to allow everyone to write and test ResultSet compressors.

Also attaching a design document explaining the changes, experimental results 
document, and a pdf explaining how to setup the docker container to observe 
end-to-end functionality of ResultSet compression. 




 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2resultSetcompressor.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attaching a design document explaining the changes, experimental results 
 document, and a pdf explaining how to setup the docker container to observe 
 end-to-end functionality of ResultSet compression. 
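
As a rough illustration of what such a plugin contract could look like (the interface and method names below are invented for this sketch; the real contract is defined by the attached patch and design document), an external compressor is something the server can load by name and apply to serialized column data before shipping it to the client:

```java
import java.util.Arrays;

// Hypothetical plugin shape, not the actual Hive/Thrift API.
interface ColumnCompressor {
    String name();                 // advertised to the client so it can decode
    byte[] compress(byte[] column);
    byte[] decompress(byte[] column);
}

// A trivial identity plugin, enough to exercise the round trip end to end.
class IdentityCompressor implements ColumnCompressor {
    public String name() { return "identity"; }
    public byte[] compress(byte[] column) { return Arrays.copyOf(column, column.length); }
    public byte[] decompress(byte[] column) { return Arrays.copyOf(column, column.length); }
}
```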



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: hs2driver-master.zip

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attaching a design document explaining the changes, experimental results 
 document, and a pdf explaining how to setup the docker container to observe 
 end-to-end functionality of ResultSet compression. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: readme.txt

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attaching a design document explaining the changes, experimental results 
 document, and a pdf explaining how to setup the docker container to observe 
 end-to-end functionality of ResultSet compression. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10443) HIVE-9870 broke hadoop-1 build

2015-04-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10443:

Attachment: HIVE-10443.1.patch

 HIVE-9870 broke hadoop-1 build
 --

 Key: HIVE-10443
 URL: https://issues.apache.org/jira/browse/HIVE-10443
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Vaibhav Gumashta
 Attachments: HIVE-10443.1.patch


 JvmPauseMonitor added in HIVE-9870 is breaking hadoop-1 build. 
 HiveServer2.startPauseMonitor() does not use reflection properly to start 
 JvmPauseMonitor.
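
A sketch of the kind of reflection-guarded startup implied here (names are illustrative, not the committed patch): the monitor class is looked up at runtime, so the code still loads and runs on a Hadoop version that does not ship JvmPauseMonitor.

```java
// Illustrative only: the real fix is in HiveServer2.startPauseMonitor().
class PauseMonitorStarter {
    static boolean startIfAvailable(String className, Object conf) {
        try {
            Class<?> clazz = Class.forName(className);
            Object monitor = clazz.getConstructor(conf.getClass()).newInstance(conf);
            clazz.getMethod("start").invoke(monitor);
            return true;
        } catch (ReflectiveOperationException e) {
            // Class, constructor, or method absent (e.g. on hadoop-1): skip quietly.
            return false;
        }
    }
}
```

The point of routing everything through ReflectiveOperationException is that a missing class on hadoop-1 degrades to a no-op instead of a linkage error at class-load time.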



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10443) HIVE-9870 broke hadoop-1 build

2015-04-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10443:

Fix Version/s: 1.2.0

 HIVE-9870 broke hadoop-1 build
 --

 Key: HIVE-10443
 URL: https://issues.apache.org/jira/browse/HIVE-10443
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Vaibhav Gumashta
 Fix For: 1.2.0

 Attachments: HIVE-10443.1.patch


 JvmPauseMonitor added in HIVE-9870 is breaking hadoop-1 build. 
 HiveServer2.startPauseMonitor() does not use reflection properly to start 
 JvmPauseMonitor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10227) Concrete implementation of Export/Import based ReplicationTaskFactory

2015-04-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509627#comment-14509627
 ] 

Lefty Leverenz commented on HIVE-10227:
---

Doc note:  HIVE-10264 will document everything related to replication, 
including the configuration parameter added here (*hive.repl.task.factory*).

 Concrete implementation of Export/Import based ReplicationTaskFactory
 -

 Key: HIVE-10227
 URL: https://issues.apache.org/jira/browse/HIVE-10227
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 1.2.0

 Attachments: HIVE-10227.2.patch, HIVE-10227.3.patch, 
 HIVE-10227.4.patch, HIVE-10227.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10443) HIVE-9870 broke hadoop-1 build

2015-04-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10443:

Attachment: (was: HIVE-10443.1.patch)

 HIVE-9870 broke hadoop-1 build
 --

 Key: HIVE-10443
 URL: https://issues.apache.org/jira/browse/HIVE-10443
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Vaibhav Gumashta

 JvmPauseMonitor added in HIVE-9870 is breaking hadoop-1 build. 
 HiveServer2.startPauseMonitor() does not use reflection properly to start 
 JvmPauseMonitor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10443) HIVE-9870 broke hadoop-1 build

2015-04-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10443:

Attachment: HIVE-10443.1.patch

 HIVE-9870 broke hadoop-1 build
 --

 Key: HIVE-10443
 URL: https://issues.apache.org/jira/browse/HIVE-10443
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Vaibhav Gumashta

 JvmPauseMonitor added in HIVE-9870 is breaking hadoop-1 build. 
 HiveServer2.startPauseMonitor() does not use reflection properly to start 
 JvmPauseMonitor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10423) HIVE-7948 breaks deploy_e2e_artifacts.sh

2015-04-23 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar resolved HIVE-10423.
-
Resolution: Fixed

 HIVE-7948 breaks deploy_e2e_artifacts.sh
 

 Key: HIVE-10423
 URL: https://issues.apache.org/jira/browse/HIVE-10423
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Aswathy Chellammal Sreekumar
 Attachments: HIVE-10423.patch


 HIVE-7948 added a step to download an ml-1m.zip file and unzip it.
 This only works if you call deploy_e2e_artifacts.sh once. If you call it 
 again (which is very common in dev), it blocks and asks for additional input 
 from the user because the target files already exist.
 This needs to be changed similarly to what we discussed for HIVE-9272, i.e. 
 place artifacts not under source control in testdist/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-23 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10434:

Attachment: HIVE-10434.4-spark.patch

Addressing RB comments #2.

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
 HIVE-10434.4-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote Driver process and 
 then waits for it to connect back to HS2. However, in certain situations (for 
 instance, a permission issue), the remote process may fail and exit with an 
 error code. In this situation, the HS2 process will still wait for the 
 process to connect, sitting through a full timeout period before it throws 
 the exception.
 What makes it worse, the user may need to wait through two timeout periods: 
 one for SparkSetReducerParallelism, and another for the actual Spark job. 
 This could be very annoying.
 We should cancel the timeout task as soon as we find out that the process has 
 failed, and mark the promise as failed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]

2015-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509619#comment-14509619
 ] 

Hive QA commented on HIVE-10302:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727654/HIVE-10302.2-spark.patch

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 8721 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did 
not produce a TEST-*.xml file
TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not 
produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/833/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/833/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-833/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727654 - PreCommit-HIVE-SPARK-Build

 Load small tables (for map join) in executor memory only once [Spark Branch]
 

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.2-spark.patch, HIVE-10302.spark-1.patch


 Usually there are multiple cores in a Spark executor, and thus it's possible 
 that multiple map-join tasks can be running in the same executor 
 (concurrently or sequentially). Currently, each task will load its own copy 
 of the small tables for map join into memory, ending up with inefficiency. 
 Ideally, we only load the small tables once and share them among the tasks 
 running in that executor.
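
One way to get this sharing inside a JVM, sketched with invented names (this is not Hive's actual implementation): a process-wide ConcurrentHashMap keyed by table, where computeIfAbsent guarantees the loader runs at most once per key even when several map-join tasks ask concurrently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Illustrative executor-wide cache for small (map-join) tables.
class SmallTableCacheSketch {
    static final ConcurrentMap<String, Object> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger LOADS = new AtomicInteger();  // counts real loads

    static Object get(String tableKey, Supplier<Object> loader) {
        // computeIfAbsent runs the loader at most once per key, so
        // concurrent tasks in the same executor share a single copy.
        return CACHE.computeIfAbsent(tableKey, k -> {
            LOADS.incrementAndGet();
            return loader.get();
        });
    }
}
```

A real implementation would also need an eviction policy (e.g. per-query scoping), which this sketch omits.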



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-23 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10239:
-
Attachment: HIVE-10239.03.patch

The Oracle installation continues to fail: apt-get fails to download the 
Oracle binaries. I added some debug output to the script to list the contents 
of /var/cache/apt/archives/, and it appears to be empty.

+ /bin/true
+ ls -al /var/cache/apt/archives
+ apt-get install -y --force-yes oracle-xe:i386

For some reason it is unable to download even the 32-bit binaries. 

For now, I am isolating the changes for postgres + derby from the oracle 
changes. I will continue to investigate the oracle script. I will file a new 
jira for this.

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.01.patch, 
 HIVE-10239.02.patch, HIVE-10239.03.patch, HIVE-10239.patch


 Need to create DB-implementation specific scripts to use the framework 
 introduced in HIVE-9800 to have any metastore schema changes tested across 
 all supported databases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10456) Grace Hash Join should not load spilled partitions on abort

2015-04-23 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10456:
-
Attachment: HIVE-10456.2.patch

 Grace Hash Join should not load spilled partitions on abort
 ---

 Key: HIVE-10456
 URL: https://issues.apache.org/jira/browse/HIVE-10456
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10456.1.patch, HIVE-10456.2.patch


 Grace Hash Join loads the spilled partitions to complete the join in 
 closeOp(). This should not happen when closeOp with abort is invoked. Instead 
 it should clean up all the spilled data.
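
The intended closeOp behavior can be sketched as follows, with invented names standing in for Hive's real hybrid grace hash join operator: on a normal close, spilled partitions are reloaded to finish the join; on an aborted close they are simply discarded.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the actual logic lives in Hive's grace hash join operator.
class GraceJoinCloseSketch {
    final List<String> spilledPartitions = new ArrayList<>();
    final List<String> joinedPartitions = new ArrayList<>();

    void spill(String partition) { spilledPartitions.add(partition); }

    void closeOp(boolean abort) {
        if (abort) {
            spilledPartitions.clear();               // clean up, do not reload
            return;
        }
        joinedPartitions.addAll(spilledPartitions);  // normal close: finish the join
        spilledPartitions.clear();
    }
}
```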



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10423) HIVE-7948 breaks deploy_e2e_artifacts.sh

2015-04-23 Thread Aswathy Chellammal Sreekumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509857#comment-14509857
 ] 

Aswathy Chellammal Sreekumar commented on HIVE-10423:
-

No, it is not committed yet.

 HIVE-7948 breaks deploy_e2e_artifacts.sh
 

 Key: HIVE-10423
 URL: https://issues.apache.org/jira/browse/HIVE-10423
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Aswathy Chellammal Sreekumar
 Attachments: HIVE-10423.patch


 HIVE-7948 added a step to download an ml-1m.zip file and unzip it.
 This only works if you call deploy_e2e_artifacts.sh once. If you call it 
 again (which is very common in dev), it blocks and asks for additional input 
 from the user because the target files already exist.
 This needs to be changed similarly to what we discussed for HIVE-9272, i.e. 
 place artifacts not under source control in testdist/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10456) Grace Hash Join should not load spilled partitions on abort

2015-04-23 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10456:
-
Attachment: (was: HIVE-10456.2.patch)

 Grace Hash Join should not load spilled partitions on abort
 ---

 Key: HIVE-10456
 URL: https://issues.apache.org/jira/browse/HIVE-10456
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10456.1.patch


 Grace Hash Join loads the spilled partitions to complete the join in 
 closeOp(). This should not happen when closeOp with abort is invoked. Instead 
 it should clean up all the spilled data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-23 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509918#comment-14509918
 ] 

Chaoyu Tang commented on HIVE-10384:


[~szehon] I think the other check currently only needs TTransportException, 
since it is the only TException thrown from HiveMetaStoreClient.open during 
reconnect; no others are. Thanks

 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.1.patch, HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
     (e.getCause() instanceof TProtocolException) ||
     (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}
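
One way to handle the wrapped case, shown with plain JDK exceptions instead of the Thrift types (an illustration of the idea, not the committed patch): search the entire cause chain for a retriable exception type rather than inspecting only the immediate cause.

```java
// Illustrative helper: walks the whole cause chain so a transport failure
// nested inside a MetaException (or deeper) is still found and retried.
class CauseChain {
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return type.cast(c);
            }
        }
        return null;
    }
}
```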



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10419) can't do query on partitioned view with analytic function in strictmode

2015-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10419:

Attachment: (was: HIVE-10419.patch)

 can't do query on partitioned view with analytic function in strictmode
 ---

 Key: HIVE-10419
 URL: https://issues.apache.org/jira/browse/HIVE-10419
 Project: Hive
  Issue Type: Bug
  Components: Hive, Views
Affects Versions: 0.13.0, 0.14.0, 1.0.0
 Environment: Cloudera 5.3.x. 
Reporter: Hector Lagos

 Hey Guys,
 I created the following table:
 CREATE TABLE t1 (id int, key string, value string) partitioned by (dt int);
 And after that i created a view on that table as follow:
 create view v1 PARTITIONED ON (dt)
 as
 SELECT * FROM (
 SELECT row_number() over (partition by key order by value asc) as row_n, * 
 FROM t1 
 ) t WHERE row_n = 1;
 We are working with hive.mapred.mode=strict, and when I try to run the query 
 select * from v1 where dt = 2, I'm getting the following error:
 FAILED: SemanticException [Error 10041]: No partition predicate found for 
 Alias v1:t:t1 Table t1
 Is this a bug or a limitation of Hive when you use analytic functions in 
 partitioned views? If I remove the row_number function, it works without 
 problems.
 Thanks in advance; any help will be appreciated. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10454) Query against partitioned table in strict mode failed with No partition predicate found even if partition predicate is specified.

2015-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10454:

Attachment: HIVE-10454.patch

 Query against partitioned table in strict mode failed with No partition 
 predicate found even if partition predicate is specified.
 ---

 Key: HIVE-10454
 URL: https://issues.apache.org/jira/browse/HIVE-10454
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10454.patch


 The following queries fail:
 {noformat}
 create table t1 (c1 int) PARTITIONED BY (c2 string);
 set hive.mapred.mode=strict;
 select * from t1 where t1.c2 > to_date(date_add(from_unixtime( 
 unix_timestamp() ),1));
 {noformat}
 The query failed with No partition predicate found for alias t1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10456) Grace Hash Join should not load spilled partitions on abort

2015-04-23 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10456:
-
Attachment: HIVE-10456.2.patch

 Grace Hash Join should not load spilled partitions on abort
 ---

 Key: HIVE-10456
 URL: https://issues.apache.org/jira/browse/HIVE-10456
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10456.1.patch, HIVE-10456.1.patch, 
 HIVE-10456.2.patch


 Grace Hash Join loads the spilled partitions to complete the join in 
 closeOp(). This should not happen when closeOp with abort is invoked. Instead 
 it should clean up all the spilled data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column

2015-04-23 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10413:
--
Attachment: HIVE-10413.3.patch

 [CBO] Return path assumes distinct column cant be same as grouping column
 -

 Key: HIVE-10413
 URL: https://issues.apache.org/jira/browse/HIVE-10413
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-10413.1.patch, HIVE-10413.2.patch, 
 HIVE-10413.3.patch, HIVE-10413.patch


 Found in cbo_udf_udaf.q tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8459) DbLockManager locking table in addition to partitions

2015-04-23 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510001#comment-14510001
 ] 

Alan Gates commented on HIVE-8459:
--

Changed this to minor as an extra shared lock on the table makes no semantic 
difference.  The lock on the partition would block any xlocks on the table 
anyway, and a read lock doesn't block other read locks or semi-shared locks.

 DbLockManager locking table in addition to partitions
 -

 Key: HIVE-8459
 URL: https://issues.apache.org/jira/browse/HIVE-8459
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Minor

 Queries and operations on partitioned tables are generating locks on the 
 whole table when they should only be locking the partition.  For example:
 {code}
 select count(*) from concur_orc_tab_part where ds = 'today';
 {code}
 This should only be locking the partition ds='today'.  But instead:
 {code}
 mysql> select * from HIVE_LOCKS;
 +----------------+----------------+----------+---------+---------------------+--------------+---------------+--------------+-------------------+----------------+---------+--------------------+
 | HL_LOCK_EXT_ID | HL_LOCK_INT_ID | HL_TXNID | HL_DB   | HL_TABLE            | HL_PARTITION | HL_LOCK_STATE | HL_LOCK_TYPE | HL_LAST_HEARTBEAT | HL_ACQUIRED_AT | HL_USER | HL_HOST            |
 +----------------+----------------+----------+---------+---------------------+--------------+---------------+--------------+-------------------+----------------+---------+--------------------+
 |            428 |              1 |        0 | default | concur_orc_tab_part | NULL         | a             | r            |     1413311172000 |  1413311171000 | hive    | node-1.example.com |
 |            428 |              2 |        0 | default | concur_orc_tab_part | ds=today     | a             | r            |     1413311172000 |  1413311171000 | hive    | node-1.example.com |
 +----------------+----------------+----------+---------+---------------------+--------------+---------------+--------------+-------------------+----------------+---------+--------------------+
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10467) Switch to GIT repository on Jenkins precommit tests

2015-04-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10467:
---
Attachment: HIVE-10467.1.patch

 Switch to GIT repository on Jenkins precommit tests 
 

 Key: HIVE-10467
 URL: https://issues.apache.org/jira/browse/HIVE-10467
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10467.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10467) Switch to GIT repository on Jenkins precommit tests

2015-04-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509785#comment-14509785
 ] 

Sergio Peña commented on HIVE-10467:


[~szehon] Could you review this small fix?

 Switch to GIT repository on Jenkins precommit tests 
 

 Key: HIVE-10467
 URL: https://issues.apache.org/jira/browse/HIVE-10467
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10467.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10434:
---
Attachment: HIVE-10434.4-spark.patch

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
 HIVE-10434.4-spark.patch, HIVE-10434.4-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote Driver process 
 and then waits for it to connect back to HS2. However, in certain 
 situations (for instance, a permission issue), the remote process may fail 
 and exit with an error code. In that case, the HS2 process will still wait 
 for the process to connect, sitting through a full timeout period before it 
 throws the exception.
 What makes it worse, the user may need to wait through two timeout periods: 
 one for SparkSetReducerParallelism, and another for the actual Spark job. 
 This could be very annoying.
 We should cancel the timeout task as soon as we find out that the process 
 has failed, and set the promise as failed. 
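A minimal sketch of the proposed fix, using only stdlib types rather than Hive's actual SparkClientImpl/RPC classes (every name below is illustrative, not the real API): fail the connection promise and cancel the pending timeout task as soon as the driver process is known to have exited with an error, instead of letting the user sit out the full timeout.

```java
import java.util.concurrent.*;

public class DriverWatch {
    // Hypothetical stand-in for the "wait for the driver to connect back"
    // logic: processExit completes with the driver's exit code when it dies.
    public static CompletableFuture<String> awaitConnection(
            CompletableFuture<Integer> processExit,
            ScheduledExecutorService scheduler,
            long timeoutMs) {
        CompletableFuture<String> promise = new CompletableFuture<>();
        // Fallback: fires only if neither a connection nor an exit is seen.
        ScheduledFuture<?> timeout = scheduler.schedule(
                () -> promise.completeExceptionally(
                        new TimeoutException("driver did not connect")),
                timeoutMs, TimeUnit.MILLISECONDS);
        // As soon as the process exits abnormally, cancel the timeout task
        // and fail the promise immediately.
        processExit.thenAccept(code -> {
            if (code != 0) {
                timeout.cancel(false);
                promise.completeExceptionally(
                        new IllegalStateException("driver exited with code " + code));
            }
        });
        return promise;
    }
}
```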



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10467) Switch to GIT repository on Jenkins precommit tests

2015-04-23 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509796#comment-14509796
 ] 

Szehon Ho commented on HIVE-10467:
--

+1 thanks for taking care of this

 Switch to GIT repository on Jenkins precommit tests 
 

 Key: HIVE-10467
 URL: https://issues.apache.org/jira/browse/HIVE-10467
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10467.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10217) LLAP: Support caching of uncompressed ORC data

2015-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10217:

Attachment: HIVE-10127.patch

 LLAP: Support caching of uncompressed ORC data
 --

 Key: HIVE-10217
 URL: https://issues.apache.org/jira/browse/HIVE-10217
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Fix For: llap

 Attachments: HIVE-10127.patch


 {code}
 Caused by: java.io.IOException: ORC compression buffer size (0) is smaller 
 than LLAP low-level cache minimum allocation size (131072). Decrease the 
 value for hive.llap.io.cache.orc.alloc.min
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:137)
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:48)
 at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
 ... 4 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10426) Rework/simplify ReplicationTaskFactory instantiation

2015-04-23 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10426:

Attachment: 10246.out

The test failures are due to the pre-commit tests still trying to run against 
svn instead of git.

This patch affects only a local part of hcat, so I've attached a test output 
for it, which shows compilation succeeding across the board and all affected 
tests succeeding.

 Rework/simplify ReplicationTaskFactory instantiation
 

 Key: HIVE-10426
 URL: https://issues.apache.org/jira/browse/HIVE-10426
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: 10246.out, HIVE-10426.patch


 Creating a new jira to continue discussions from HIVE-10227 as to what 
 ReplicationTask.Factory instantiation should look like.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-04-23 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-9152:
---
Attachment: HIVE-9152.3-spark.patch

Started working on this JIRA again. Rebased and regenerated the old patch.

 Dynamic Partition Pruning [Spark Branch]
 

 Key: HIVE-9152
 URL: https://issues.apache.org/jira/browse/HIVE-9152
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Chao Sun
 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.2-spark.patch, 
 HIVE-9152.3-spark.patch


 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10423) HIVE-7948 breaks deploy_e2e_artifacts.sh

2015-04-23 Thread Aswathy Chellammal Sreekumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509861#comment-14509861
 ] 

Aswathy Chellammal Sreekumar commented on HIVE-10423:
-

I think I marked it resolved mistakenly.

 HIVE-7948 breaks deploy_e2e_artifacts.sh
 

 Key: HIVE-10423
 URL: https://issues.apache.org/jira/browse/HIVE-10423
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Aswathy Chellammal Sreekumar
 Attachments: HIVE-10423.patch


 HIVE-7948 added a step to download the ml-1m.zip file and unzip it.
 This only works if you call deploy_e2e_artifacts.sh once.  If you call it 
 again (which is very common in dev), it blocks and asks for additional input 
 from the user because the target files already exist.
 This needs to be changed similarly to what we discussed for HIVE-9272, i.e. 
 place artifacts not under source control in testdist/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-23 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509927#comment-14509927
 ] 

Sushanth Sowmyan commented on HIVE-5672:


Agree with Xuefu - the grammar might be simplified by making KW_LOCAL 
optional, since no other place in the hive code seems to make use of 
TOK_LOCAL_DIR.

To wit, we could have :

{code}
KW_LOCAL? KW_DIRECTORY StringLiteral tableRowFormat? tableFileFormat? -> 
^(TOK_DIR StringLiteral tableRowFormat? tableFileFormat?)
{code}

This does mean that if, at some point, we still want to differentiate 
between local and non-local writes, we have to go back to Nemon's approach; 
and his approach is definitely the least-damage-done approach of not trying 
to remove something that already exists, so his patch makes sense from that 
point of view.

We have two approaches here, and I'm +1 on both:

a) Nemon's approach
b) Xuefu's suggestion: make KW_LOCAL optional, emit a TOK_DIR instead of a 
TOK_LOCAL_DIR for that line, and remove any other occurrences of 
TOK_LOCAL_DIR. We might eventually add it back because we want it, but it's 
duplicate-code pruning in the meanwhile.

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great but non local 
 directory don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10456) Grace Hash Join should not load spilled partitions on abort

2015-04-23 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509960#comment-14509960
 ] 

Prasanth Jayachandran commented on HIVE-10456:
--

[~hagleitn] Can you take a look again?

 Grace Hash Join should not load spilled partitions on abort
 ---

 Key: HIVE-10456
 URL: https://issues.apache.org/jira/browse/HIVE-10456
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10456.1.patch, HIVE-10456.1.patch, 
 HIVE-10456.2.patch


 Grace Hash Join loads the spilled partitions to complete the join in 
 closeOp(). This should not happen when closeOp with abort is invoked. Instead 
 it should clean up all the spilled data.
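The intended contract can be sketched in miniature (a toy stand-in for illustration only, not Hive's actual operator or hash-table-container code): on a normal close, spilled partitions are reloaded to finish the join; on abort, they are simply discarded.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the closeOp(abort) behavior described above.
class GraceJoinSketch {
    final List<String> spilledPartitions = new ArrayList<>();
    final List<String> joinedPartitions = new ArrayList<>();

    void closeOp(boolean abort) {
        if (abort) {
            // Aborted query: just clean up spilled data, never load it.
            spilledPartitions.clear();
            return;
        }
        // Normal close: load each spilled partition and complete the join.
        joinedPartitions.addAll(spilledPartitions);
        spilledPartitions.clear();
    }
}
```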



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8521) Document the ORC format

2015-04-23 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-8521.
-
Resolution: Fixed

This was added to the wiki.

 Document the ORC format
 ---

 Key: HIVE-8521
 URL: https://issues.apache.org/jira/browse/HIVE-8521
 Project: Hive
  Issue Type: Bug
  Components: Documentation, File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: orc-spec.pdf


 It is past time that we document the ORC file format. I've started and should 
 have a first pass this week.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10423) HIVE-7948 breaks deploy_e2e_artifacts.sh

2015-04-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509816#comment-14509816
 ] 

Eugene Koifman commented on HIVE-10423:
---

[~asreekumar], did someone commit this patch?

 HIVE-7948 breaks deploy_e2e_artifacts.sh
 

 Key: HIVE-10423
 URL: https://issues.apache.org/jira/browse/HIVE-10423
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Aswathy Chellammal Sreekumar
 Attachments: HIVE-10423.patch


 HIVE-7948 added a step to download the ml-1m.zip file and unzip it.
 This only works if you call deploy_e2e_artifacts.sh once.  If you call it 
 again (which is very common in dev), it blocks and asks for additional input 
 from the user because the target files already exist.
 This needs to be changed similarly to what we discussed for HIVE-9272, i.e. 
 place artifacts not under source control in testdist/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10467) Switch to GIT repository on Jenkins precommit tests

2015-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509867#comment-14509867
 ] 

Sergey Shelukhin commented on HIVE-10467:
-

Thanks!

 Switch to GIT repository on Jenkins precommit tests 
 

 Key: HIVE-10467
 URL: https://issues.apache.org/jira/browse/HIVE-10467
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
 Fix For: 1.2.0

 Attachments: HIVE-10467.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10456) Grace Hash Join should not load spilled partitions on abort

2015-04-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10456:
---
Attachment: HIVE-10456.1.patch

 Grace Hash Join should not load spilled partitions on abort
 ---

 Key: HIVE-10456
 URL: https://issues.apache.org/jira/browse/HIVE-10456
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10456.1.patch, HIVE-10456.1.patch


 Grace Hash Join loads the spilled partitions to complete the join in 
 closeOp(). This should not happen when closeOp with abort is invoked. Instead 
 it should clean up all the spilled data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10470) LLAP: NPE in IO when returning 0 rows with no projection

2015-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10470:

Description: 
Looks like a trivial fix, unless I'm missing something. I may do it later if 
you don't ;)

{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.io.orc.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1764)
at 
org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:92)
at 
org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:39)
at 
org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:116)
at 
org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:36)
at 
org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:329)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:299)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:55)
at 
org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
... 4 more
{noformat}

  was:Looks like a trivial fix, unless I'm missing something. I may do it later 
if you don't ;)


 LLAP: NPE in IO when returning 0 rows with no projection
 

 Key: HIVE-10470
 URL: https://issues.apache.org/jira/browse/HIVE-10470
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Prasanth Jayachandran

 Looks like a trivial fix, unless I'm missing something. I may do it later if 
 you don't ;)
 {noformat}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1764)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:92)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:39)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:116)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:36)
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:329)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:299)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:55)
   at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
   ... 4 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9451) Add max size of column dictionaries to ORC metadata

2015-04-23 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-9451:

Attachment: HIVE-9451.patch

This patch adds the maxDictionarySize and configured stripe size to the 
metadata of ORC files. I'll need to update the expected results for the qfiles 
that depend on the size of orc files.

 Add max size of column dictionaries to ORC metadata
 ---

 Key: HIVE-9451
 URL: https://issues.apache.org/jira/browse/HIVE-9451
 Project: Hive
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 1.2.0

 Attachments: HIVE-9451.patch


 To predict the amount of memory required to read an ORC file we need to know 
 the size of the dictionaries for the columns that we are reading. I propose 
 adding the number of bytes for each column's dictionary to the stripe's 
 column statistics. The file's column statistics would have the maximum 
 dictionary size for each column.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10466) LLAP: fix container sizing configuration for memory

2015-04-23 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509794#comment-14509794
 ] 

Gopal V commented on HIVE-10466:


The current memory script does this - it uses NODE_MEM/2 for the 
per-instance executors & uses 1Gb by default as the cache.

Then it goes through a bunch of complex heuristics to produce a complete 
configuration listing which contains the YARN container.size, the Xmx and the 
total memory allocated to executors.
 
https://github.com/apache/hive/blob/llap/llap-server/src/main/resources/package.py#L14

This produces a workable configuration, but it misses the total capacity of the 
node by a significant margin (will be > 60%, so no double allocs on a single 
node, but will be < 100%).

Even in that script, the yarn min-alloc is missing from the computation, so 
the closer we edge to the line, the harder it gets to configure this correctly.

After that, there's the whole YARN reserved memory fraction to deal with in 
this, so that we can avoid taking up memory in YARN that LLAP can't use.

 LLAP: fix container sizing configuration for memory
 ---

 Key: HIVE-10466
 URL: https://issues.apache.org/jira/browse/HIVE-10466
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Vikram Dixit K

 We cannot use full machine for LLAP due to config for cache and executors 
 being split brain... please refer to  [~gopalv] for details



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9957) Hive 1.1.0 not compatible with Hadoop 2.4.0

2015-04-23 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509790#comment-14509790
 ] 

Thejas M Nair commented on HIVE-9957:
-

[~subhashmv] What error do you get with 1.0 and hadoop 2.5 ? I wasn't aware of 
such an issue.

To apply the patch, you need to check out the code, apply the patch, and build 
a new hive package (tar.gz).
Additional build instructions are here: 
https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ

 Hive 1.1.0 not compatible with Hadoop 2.4.0
 ---

 Key: HIVE-9957
 URL: https://issues.apache.org/jira/browse/HIVE-9957
 Project: Hive
  Issue Type: Bug
  Components: Encryption
Reporter: Vivek Shrivastava
Assignee: Sergio Peña
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9957.1.patch


 Getting this exception while accessing data through Hive. 
 Exception in thread main java.lang.NoSuchMethodError: 
 org.apache.hadoop.hdfs.DFSClient.getKeyProvider()Lorg/apache/hadoop/crypto/key/KeyProvider;
 at 
 org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.init(Hadoop23Shims.java:1152)
 at 
 org.apache.hadoop.hive.shims.Hadoop23Shims.createHdfsEncryptionShim(Hadoop23Shims.java:1279)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.getHdfsEncryptionShim(SessionState.java:392)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1756)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:1875)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1689)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1427)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10132)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-23 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509900#comment-14509900
 ] 

Szehon Ho commented on HIVE-10384:
--

Hey Chaoyu, the new patch looks good. I'm just wondering: does the other check 
still need all of the TApplication|TProtocol|TTransport checks?

 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.1.patch, HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
 (e.getCause() instanceof TProtocolException) ||
 (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
 e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}
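One way to cover the wrapped case, sketched here with stand-in exception classes (the real code uses Thrift's TException hierarchy and Hive's MetaException, so names and the exact pattern below are assumptions): in addition to the instanceof checks, pattern-match the MetaException message for an embedded transport failure.

```java
import java.util.regex.Pattern;

public class RetryCheck {
    // Hypothetical stand-ins for the Thrift/metastore exception types.
    static class TTransportException extends Exception {
        TTransportException(String m) { super(m); }
    }
    static class MetaException extends Exception {
        MetaException(String m) { super(m); }
    }

    private static final Pattern WRAPPED_TRANSPORT =
            Pattern.compile("(?s).*TTransportException.*");

    // Retry if the cause is a transport exception directly, or if a
    // MetaException merely *mentions* one in its message - the wrapped
    // case that an instanceof-only check misses.
    static boolean isRetriable(Exception e) {
        Throwable cause = e.getCause();
        if (cause instanceof TTransportException) {
            return true;
        }
        return cause instanceof MetaException
                && cause.getMessage() != null
                && WRAPPED_TRANSPORT.matcher(cause.getMessage()).matches();
    }
}
```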



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8459) DbLockManager locking table in addition to partitions

2015-04-23 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8459:
-
Priority: Minor  (was: Major)

 DbLockManager locking table in addition to partitions
 -

 Key: HIVE-8459
 URL: https://issues.apache.org/jira/browse/HIVE-8459
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Minor

 Queries and operations on partitioned tables are generating locks on the 
 whole table when they should only be locking the partition.  For example:
 {code}
 select count(*) from concur_orc_tab_part where ds = 'today';
 {code}
 This should only be locking the partition ds='today'.  But instead:
 {code}
 mysql> select * from HIVE_LOCKS;
 +----------------+----------------+----------+---------+---------------------+--------------+---------------+--------------+-------------------+----------------+---------+--------------------+
 | HL_LOCK_EXT_ID | HL_LOCK_INT_ID | HL_TXNID | HL_DB   | HL_TABLE            | HL_PARTITION | HL_LOCK_STATE | HL_LOCK_TYPE | HL_LAST_HEARTBEAT | HL_ACQUIRED_AT | HL_USER | HL_HOST            |
 +----------------+----------------+----------+---------+---------------------+--------------+---------------+--------------+-------------------+----------------+---------+--------------------+
 |            428 |              1 |        0 | default | concur_orc_tab_part | NULL         | a             | r            |     1413311172000 |  1413311171000 | hive    | node-1.example.com |
 |            428 |              2 |        0 | default | concur_orc_tab_part | ds=today     | a             | r            |     1413311172000 |  1413311171000 | hive    | node-1.example.com |
 +----------------+----------------+----------+---------+---------------------+--------------+---------------+--------------+-------------------+----------------+---------+--------------------+
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10465) whitelist restrictions don't get initialized in new copy of HiveConf

2015-04-23 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510010#comment-14510010
 ] 

Daniel Dai commented on HIVE-10465:
---

LGTM, +1.

 whitelist restrictions don't get initialized in new copy of HiveConf
 

 Key: HIVE-10465
 URL: https://issues.apache.org/jira/browse/HIVE-10465
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-10465.1.patch


 Whitelist restrictions use a regex pattern in HiveConf, but when a new 
 HiveConf object copy is created, the regex pattern is not initialized in the 
 new HiveConf copy.
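The bug pattern can be illustrated in miniature (this is not the real HiveConf, whose fields and copy constructor differ; every name here is hypothetical): a copy constructor must re-derive the compiled whitelist pattern from the copied string, otherwise the copy silently carries no restriction.

```java
import java.util.regex.Pattern;

// Toy config object with derived state, showing the fix: the copy
// constructor re-initializes the compiled Pattern instead of leaving it null.
class ConfSketch {
    String whitelist;
    Pattern compiled;   // derived from whitelist, not copied automatically

    ConfSketch(String whitelist) {
        this.whitelist = whitelist;
        this.compiled = Pattern.compile(whitelist);
    }

    ConfSketch(ConfSketch other) {
        this.whitelist = other.whitelist;
        // The fix: re-derive the pattern in the copy; forgetting this line
        // leaves the copy with compiled == null and no enforcement.
        this.compiled = Pattern.compile(other.whitelist);
    }

    boolean isModifiable(String key) {
        return compiled != null && compiled.matcher(key).matches();
    }
}
```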



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2015-04-23 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10468:
-
Description: This JIRA is to isolate the work specific to Oracle DB in 
HIVE-10239. Because of the absence of 64-bit Debian packages for oracle-xe, 
the apt-get install fails on the AWS systems.

 Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
 --

 Key: HIVE-10468
 URL: https://issues.apache.org/jira/browse/HIVE-10468
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam

 This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
 of the absence of 64-bit Debian packages for oracle-xe, the apt-get install 
 fails on the AWS systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10442) HIVE-10098 broke hadoop-1 build

2015-04-23 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-10442:
---

Assignee: Yongzhi Chen

 HIVE-10098 broke hadoop-1 build
 ---

 Key: HIVE-10442
 URL: https://issues.apache.org/jira/browse/HIVE-10442
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Yongzhi Chen

 fs.addDelegationTokens() method does not seem to exist in hadoop 1.2.1. This 
 breaks the hadoop-1 builds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10467) Switch to GIT repository on Jenkins precommit tests

2015-04-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509895#comment-14509895
 ] 

Sergio Peña commented on HIVE-10467:


This patch was just part of it.
There are some properties files on the Jenkins instance where this needs to be 
changed as well.

All *.properties files that exist in /usr/local/hiveptest/etc/public have the 
following lines that must be changed:

{noformat}
repositoryType = git
repository = https://git-wip-us.apache.org/repos/asf/hive.git
repositoryName = apache-git-master
{noformat}

I will leave this for future reference.

 Switch to GIT repository on Jenkins precommit tests 
 

 Key: HIVE-10467
 URL: https://issues.apache.org/jira/browse/HIVE-10467
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
 Fix For: 1.2.0

 Attachments: HIVE-10467.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10239:
---
Attachment: HIVE-10239.03.patch

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.01.patch, 
 HIVE-10239.02.patch, HIVE-10239.03.patch, HIVE-10239.03.patch, 
 HIVE-10239.patch


 Need to create DB-implementation specific scripts to use the framework 
 introduced in HIVE-9800 to have any metastore schema changes tested across 
 all supported databases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10443) HIVE-9870 broke hadoop-1 build

2015-04-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10443:
---
Attachment: HIVE-10443.1.patch

 HIVE-9870 broke hadoop-1 build
 --

 Key: HIVE-10443
 URL: https://issues.apache.org/jira/browse/HIVE-10443
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Vaibhav Gumashta
 Fix For: 1.2.0

 Attachments: HIVE-10443.1.patch, HIVE-10443.1.patch


 JvmPauseMonitor added in HIVE-9870 is breaking hadoop-1 build. 
 HiveServer2.startPauseMonitor() does not use reflection properly to start 
 JvmPauseMonitor.
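A sketch of what loading an optional Hadoop class via reflection might look like - the class name org.apache.hadoop.util.JvmPauseMonitor is real, but the method shape, parameter, and error handling below are assumptions, not Hive's actual HiveServer2 code:

```java
public class PauseMonitorStarter {
    // Start the pause monitor only if the class (and expected constructor)
    // exists on this Hadoop version; hadoop-1 lacks it, so we skip quietly
    // instead of failing server startup with a linkage error.
    static void startPauseMonitor(Object conf) {
        try {
            Class<?> clazz = Class.forName("org.apache.hadoop.util.JvmPauseMonitor");
            Object monitor = clazz.getConstructor(conf.getClass()).newInstance(conf);
            clazz.getMethod("start").invoke(monitor);
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            // Class or constructor absent on this Hadoop version: no-op.
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("failed to start JvmPauseMonitor", e);
        }
    }
}
```

The key point is that the class name appears only as a string, so the calling code compiles against any Hadoop version.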



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10419) can't do query on partitioned view with analytic function in strictmode

2015-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10419:

Attachment: HIVE-10419.patch

 can't do query on partitioned view with analytic function in strictmode
 ---

 Key: HIVE-10419
 URL: https://issues.apache.org/jira/browse/HIVE-10419
 Project: Hive
  Issue Type: Bug
  Components: Hive, Views
Affects Versions: 0.13.0, 0.14.0, 1.0.0
 Environment: Cloudera 5.3.x. 
Reporter: Hector Lagos

 Hey Guys,
 I created the following table:
 CREATE TABLE t1 (id int, key string, value string) partitioned by (dt int);
 And after that I created a view on that table as follows:
 create view v1 PARTITIONED ON (dt)
 as
 SELECT * FROM (
 SELECT row_number() over (partition by key order by value asc) as row_n, * 
 FROM t1 
 ) t WHERE row_n = 1;
 We are working with hive.mapred.mode=strict, and when I try to run the query 
 select * from v1 where dt = 2, I'm getting the following error:
 FAILED: SemanticException [Error 10041]: No partition predicate found for 
 Alias v1:t:t1 Table t1
 Is this a bug or a limitation of Hive when you use analytic functions in 
 partitioned views? If I remove the row_number function it works without 
 problems. 
 Thanks in advance; any help will be appreciated. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10472) Jenkins HMS upgrade test is not publishing results due to GIT change

2015-04-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10472:
---
Attachment: HIVE-10472.1.patch

[~szehon] Here's another tiny fix for GIT so that HMS upgrade tests can publish 
the results. 

 Jenkins HMS upgrade test is not publishing results due to GIT change
 

 Key: HIVE-10472
 URL: https://issues.apache.org/jira/browse/HIVE-10472
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10472.1.patch


 This error is happening on Jenkins when running the HMS upgrade tests. 
 The class used to publish the results is not found on any directory.
 + cd /var/lib/jenkins/jobs/PreCommit-HIVE-METASTORE-Test/workspace
 + set +x
 Exception in thread main java.lang.NoClassDefFoundError: 
 org/apache/hive/ptest/execution/JIRAService
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hive.ptest.execution.JIRAService
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hive.ptest.execution.JIRAService.  
 Program will exit.
 + ret=0
 The problem is that jenkins-execute-hms-test.sh downloads the 
 code to another directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509440#comment-14509440
 ] 

Xuefu Zhang commented on HIVE-5672:
---

Looking at the patch, I'm not sure I understand the changes correctly. I can 
see that we modified the grammar to make LOCAL optional, and the rest is 
refactoring. I'm not sure this is sufficient. Did I miss anything?

Also, instead of adding a new grammar rule, we should combine it with the old 
one. We just need to make KW_LOCAL optional.

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great, but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10464) How i find the kryo version

2015-04-23 Thread ankush (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509493#comment-14509493
 ] 

ankush commented on HIVE-10464:
---

I asked the question on the u...@hive.apache.org mailing list.

Thank you

 How i find the kryo version 
 

 Key: HIVE-10464
 URL: https://issues.apache.org/jira/browse/HIVE-10464
 Project: Hive
  Issue Type: Improvement
Reporter: ankush

 Could you please let me know how I can find the Kryo version that I am using?
 Please help me with this; we are just running HQL (Hive) queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-23 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10434:

Attachment: HIVE-10434.3-spark.patch

Addressing RB comments.

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote driver process, 
 and then waits for it to connect back to HS2. However, in certain 
 situations (for instance, a permission issue), the remote process may fail 
 and exit with an error code. In this situation, the HS2 process will still 
 wait for the process to connect, and wait for a full timeout period before 
 it throws the exception.
 What makes it worse, the user may need to wait for two timeout periods: one 
 for SparkSetReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task once we find out that the process has 
 failed, and set the promise as failed. 
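The fail-fast behavior the description asks for can be sketched as follows. This is an illustrative Python sketch, not Hive's actual Java implementation; `launch_with_timeout`, `FailingProc`, and the timeout values are made up for the example. The point is the ordering: a process monitor cancels the pending timeout task and fails the promise as soon as the child exits, instead of sitting out the full timeout.

```python
import threading
from concurrent.futures import Future


def launch_with_timeout(start_process, timeout_s):
    """Launch a child process and return a promise that fails fast.

    start_process() must return an object with wait() -> exit code.
    """
    promise = Future()  # resolved when the remote driver connects back

    def on_timeout():
        if not promise.done():
            promise.set_exception(TimeoutError("driver never connected"))

    timer = threading.Timer(timeout_s, on_timeout)
    timer.start()

    def monitor():
        code = start_process().wait()
        if code != 0 and not promise.done():
            # Key fix: cancel the timeout task and fail the promise
            # immediately instead of waiting out the timeout period.
            timer.cancel()
            promise.set_exception(
                RuntimeError("driver exited with code %d" % code))

    threading.Thread(target=monitor, daemon=True).start()
    return promise


class FailingProc:
    def wait(self):
        return 1  # simulate a driver that dies right after launch


promise = launch_with_timeout(lambda: FailingProc(), timeout_s=60.0)
try:
    promise.result(timeout=5)  # returns in milliseconds, not after 60s
    failed_fast = False
except RuntimeError:
    failed_fast = True
```

The same shape applies twice in the description's scenario: once for the SparkSetReducerParallelism connection and once for the actual job, which is why failing fast removes both waits.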



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10347) Merge spark to trunk 4/15/2015

2015-04-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510449#comment-14510449
 ] 

Lefty Leverenz commented on HIVE-10347:
---

Doc note:  TODOC-SPARK labels are on the individual JIRA issues.  I'll add 
TODOC-1.2 labels too.

 Merge spark to trunk 4/15/2015
 --

 Key: HIVE-10347
 URL: https://issues.apache.org/jira/browse/HIVE-10347
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 1.2.0

 Attachments: HIVE-10347.2.patch, HIVE-10347.2.patch, 
 HIVE-10347.3.patch, HIVE-10347.4.patch, HIVE-10347.5.patch, 
 HIVE-10347.5.patch, HIVE-10347.6.patch, HIVE-10347.6.patch, HIVE-10347.patch


 CLEAR LIBRARY CACHE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10465) whitelist restrictions don't get initialized in new copy of HiveConf

2015-04-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10465:
-
Attachment: HIVE-10465.1.patch

 whitelist restrictions don't get initialized in new copy of HiveConf
 

 Key: HIVE-10465
 URL: https://issues.apache.org/jira/browse/HIVE-10465
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-10465.1.patch


 Whitelist restrictions use a regex pattern in HiveConf, but when a new 
 HiveConf object copy is created, the regex pattern is not initialized in the 
 new HiveConf copy.
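The pitfall described above can be reduced to a small sketch. This is not HiveConf's real code; `ConfSketch` and its field names are invented for illustration. The bug class is: a derived artifact (the compiled whitelist pattern) is initialized only on the "fresh config" path, so a copy constructor that copies raw fields silently drops the restriction unless it re-derives the pattern too.

```python
import re


class ConfSketch:
    """Toy config with a whitelist regex and its derived compiled pattern."""

    def __init__(self, whitelist_regex=None, other=None):
        if other is not None:
            # copy path: take the raw setting from the source config
            whitelist_regex = other.whitelist_regex
        self.whitelist_regex = whitelist_regex
        # The fix amounts to making sure this derivation runs on BOTH
        # construction paths, not only when a config is built from scratch.
        self._pattern = (
            re.compile(whitelist_regex) if whitelist_regex else None)

    def is_modifiable(self, name):
        # No whitelist configured means everything is modifiable.
        return self._pattern is None or \
            self._pattern.fullmatch(name) is not None


orig = ConfSketch(whitelist_regex=r"hive\.exec\..*")
cloned = ConfSketch(other=orig)  # pattern re-initialized in the copy
allowed = cloned.is_modifiable("hive.exec.parallel")
blocked = not cloned.is_modifiable("hive.metastore.uris")
```

Had the copy path skipped the `re.compile` step, `cloned` would have no pattern at all and would treat every parameter as modifiable, which is the failure mode the issue describes.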



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore

2015-04-23 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509746#comment-14509746
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-4625:
-

The test failures are unrelated to the change.

Thanks
Hari

 HS2 should not attempt to get delegation token from metastore if using 
 embedded metastore
 -

 Key: HIVE-4625
 URL: https://issues.apache.org/jira/browse/HIVE-4625
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-4625.1.patch, HIVE-4625.2.patch, HIVE-4625.3.patch, 
 HIVE-4625.4.patch, HIVE-4625.5.patch


 In Kerberos secure mode, with doAs enabled, HiveServer2 tries to get a 
 delegation token from the metastore even if the metastore is being used in 
 embedded mode. 
 To avoid failure in that case, it uses a catch block for the thrown 
 UnsupportedOperationException that does nothing. But this leads to an 
 error being logged by lower levels and can mislead users into thinking that 
 there is a problem.
 It should check whether delegation token mode is supported with the current 
 configuration before calling the function.
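The guard the description asks for can be sketched like this. This is a hypothetical Python sketch, not HiveServer2's Java code; the config keys `remote_metastore` and `kerberos` are stand-ins for whatever the real support check inspects. The shape is: test capability first and skip the call, rather than calling unconditionally and swallowing the exception (which still logs a misleading error at lower levels).

```python
def get_delegation_token(conf, fetch_from_metastore):
    """Fetch a delegation token only when the configuration supports it."""
    # Guard first: an embedded or non-secure metastore cannot issue tokens,
    # so don't even attempt the call (no exception, no spurious error log).
    supports_tokens = (conf.get("remote_metastore", False)
                       and conf.get("kerberos", False))
    if not supports_tokens:
        return None
    return fetch_from_metastore()


calls = []


def fake_fetch():
    calls.append("rpc")
    return "token"


# Embedded metastore: the fetch function is never invoked.
embedded = {"remote_metastore": False, "kerberos": True}
token = get_delegation_token(embedded, fake_fetch)
```

The try/catch-and-ignore version behaves identically from the caller's point of view, but leaves an ERROR line in the logs every time; the guard version does not.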



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510047#comment-14510047
 ] 

Sergey Shelukhin commented on HIVE-10474:
-

[~hagleitn] [~sseth] [~gopalv] fyi

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, tez with container reuse (current default LLAP configuration but 
 mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10474:

Description: 
While most queries run faster in LLAP than just Tez with container reuse, TPCH 
Q1 is much slower.
On my run, on tez with container reuse (current default LLAP configuration but 
mode == container and no daemons running)  runs 2-6 (out of 6 consecutive runs 
in the same session) finished in 25.5sec average; with 16 LLAP daemons in 
default config the average was 35.5sec; same w/o IO elevator (to rule out its 
impact) it took 59.7sec w/strange distribution (later runs were slower than 
earlier runs, still, fastest run was 49.5sec).

So excluding IO elevator it's more than 2x degradation.

We need to figure out why this is happening. Is it just slot discrepancy? 
Regardless, this needs to be addressed.

  was:
While most queries run faster in LLAP than just Tez with container reuse, TPCH 
Q1 is much slower.
On my run, tez with container reuse (current default LLAP configuration but 
mode == container and no daemons running)  runs 2-6 (out of 6 consecutive runs 
in the same session) finished in 25.5sec average; with 16 LLAP daemons in 
default config the average was 35.5sec; same w/o IO elevator (to rule out its 
impact) it took 59.7sec w/strange distribution (later runs were slower than 
earlier runs, still, fastest run was 49.5sec).

So excluding IO elevator it's more than 2x degradation.

We need to figure out why this is happening. Is it just slot discrepancy? 
Regardless, this needs to be addressed.


 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10451) PTF deserializer fails if values are not used in reducer

2015-04-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10451:

Attachment: HIVE-10451.1.patch

Patch which fixes the typeinfo parsing, since the plan is valid and correct.

  PTF deserializer fails if values are not used in reducer 
 --

 Key: HIVE-10451
 URL: https://issues.apache.org/jira/browse/HIVE-10451
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10451.1.patch, HIVE-10451.patch


 In this particular case a Select on top of the PTF Op is needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10451) PTF deserializer fails if values are not used in reducer

2015-04-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10451:

Summary:  PTF deserializer fails if values are not used in reducer   (was: 
IdentityProjectRemover removed useful project)

  PTF deserializer fails if values are not used in reducer 
 --

 Key: HIVE-10451
 URL: https://issues.apache.org/jira/browse/HIVE-10451
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10451.1.patch, HIVE-10451.patch


 In this particular case a Select on top of the PTF Op is needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10036) Writing ORC format big table causes OOM - too many fixed sized stream buffers

2015-04-23 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510096#comment-14510096
 ] 

Owen O'Malley commented on HIVE-10036:
--

I understand the problem, but this patch creates more trouble than it fixes. 
The original design is such that you don't do buffer copies. This patch 
destroys that design and adds both buffer copies and reallocations. We should 
set the buffer sizes down for the bit vectors, but this patch is going the 
wrong way.

 Writing ORC format big table causes OOM - too many fixed sized stream buffers
 -

 Key: HIVE-10036
 URL: https://issues.apache.org/jira/browse/HIVE-10036
 Project: Hive
  Issue Type: Improvement
Reporter: Selina Zhang
Assignee: Selina Zhang
  Labels: orcfile
 Attachments: HIVE-10036.1.patch, HIVE-10036.2.patch, 
 HIVE-10036.3.patch, HIVE-10036.5.patch, HIVE-10036.6.patch


 ORC writer keeps multiple output streams for each column. Each output stream 
 is allocated a fixed-size ByteBuffer (configurable, default 256K). For a big 
 table, the memory cost is unbearable. Especially when HCatalog dynamic 
 partitioning is involved, several hundred files may be open and being 
 written at the same time (same problem for FileSinkOperator). 
 The global ORC memory manager controls the buffer size, but it only kicks in 
 at 5000-row intervals. An enhancement could be done here, but the problem is 
 that reducing the buffer size introduces worse compression and more IOs on 
 the read path. Sacrificing read performance is never a good choice. 
 I changed the fixed-size ByteBuffer to a dynamically growing buffer bounded 
 by the existing configurable buffer size. Most of the streams do not need a 
 large buffer, so performance improved significantly. Comparing to 
 Facebook's hive-dwrf, I measured a 2x performance gain with this fix. 
 Solving OOM for ORC completely may need a lot of effort, but this is 
 definitely low-hanging fruit. 
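The dynamically growing buffer idea can be sketched as below. This is an illustrative Python sketch, not the actual patch; class and parameter names are invented, and note that the review comment in this thread objects precisely to the reallocation-and-copy step marked in the code, which the original fixed-size design avoided.

```python
class GrowableBuffer:
    """Start small and grow geometrically, bounded by the configured
    per-stream buffer size, instead of pre-allocating the full size."""

    def __init__(self, initial=1024, max_size=256 * 1024):
        self.max_size = max_size
        self.data = bytearray(initial)
        self.used = 0

    def write(self, chunk):
        need = self.used + len(chunk)
        if need > len(self.data):
            # Grow geometrically, capped at the configured maximum.
            new_len = min(self.max_size, max(need, len(self.data) * 2))
            if need > new_len:
                raise IOError("stream exceeds configured buffer size")
            grown = bytearray(new_len)
            grown[:self.used] = self.data[:self.used]
            # Reallocation + copy: the cost the review comment objects to.
            self.data = grown
        self.data[self.used:need] = chunk
        self.used = need


buf = GrowableBuffer(initial=16)
buf.write(b"x" * 100)  # triggers one growth step
grew_ok = (buf.used == 100 and len(buf.data) >= 100
           and len(buf.data) <= buf.max_size)
```

Streams that stay small (e.g. short dictionary or present streams) never pay for the full 256K, which is where the memory savings come from; streams that do fill up pay extra copies, which is the trade-off under debate.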



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10475) LLAP: Minor fixes after tez api enhancements for dag completion

2015-04-23 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10475:
--
Attachment: HIVE-10475.1.txt

 LLAP: Minor fixes after tez api enhancements for dag completion
 ---

 Key: HIVE-10475
 URL: https://issues.apache.org/jira/browse/HIVE-10475
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: llap

 Attachments: HIVE-10475.1.txt


 TEZ-2212 and TEZ-2361  add APIs to propagate dag completion information to 
 the TaskCommunicator plugin. This jira is for minor fixes to get the llap 
 branch to compile against these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10416:
---
Attachment: HIVE-10416.02.patch

[~jpullokkaran], I have changed the patch according to our discussion. Could 
you check it? Thanks

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.01.patch, HIVE-10416.02.patch, 
 HIVE-10416.patch


 When return path is on, if the plan's top operator is a Sort, we need to 
 produce a SelectOp that will output exactly the columns needed by the FS.
 The following query reproduces the problem:
 {noformat}
 select cbo_t3.c_int, c, count(*)
 from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1
 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0)
 group by c_float, cbo_t1.c_int, key order by a) cbo_t1
 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2
 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0)
 group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on 
 cbo_t1.a=p
 join cbo_t3 on cbo_t1.a=key
 where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0)
 group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10431) HIVE-9555 broke hadoop-1 build

2015-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10431:

Attachment: HIVE-10431.patch

This fixes the build issue for me

 HIVE-9555 broke hadoop-1 build
 --

 Key: HIVE-10431
 URL: https://issues.apache.org/jira/browse/HIVE-10431
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Sergey Shelukhin
 Attachments: HIVE-10431.patch


 HIVE-9555: RecordReaderUtils uses a direct ByteBuffer read from 
 FSDataInputStream, which is not present in hadoop-1. This breaks hadoop-1 
 compilation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10474:

Description: 
While most queries run faster in LLAP than just Tez with container reuse, TPCH 
Q1 is much slower.
On my run, tez with container reuse (current default LLAP configuration but 
mode == container and no daemons running)  run 2-6 (out of 6) finished in 
25.5sec average; with 16 LLAP daemons in default config it finished in 35.5sec; 
w/the daemons w/o IO elevator (to rule out its impact) it took 59.7sec 
w/strange distribution (later runs were slower than earlier runs, still, 
fastest run was 49.5sec).

We need to figure out why this is happening. Is it just slot discrepancy? 
Regardless, this needs to be addressed.

  was:
While most queries run faster in LLAP than just Tez with container reuse, TPCH 
Q1 is much slower.
On my run, tez with container reuse (current default LLAP configuration but 
mode == container and no daemons running) finished in 25.5sec average; with 16 
LLAP daemons in default config it finished in 35.5sec; w/the daemons w/o IO 
elevator (to rule out its impact) it took 59.7sec w/strange distribution (later 
runs were slower than earlier runs, still, fastest run was 49.5sec).

We need to figure out why this is happening. Is it just slot discrepancy? 
Regardless, this needs to be addressed.


 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, tez with container reuse (current default LLAP configuration but 
 mode == container and no daemons running)  run 2-6 (out of 6) finished in 
 25.5sec average; with 16 LLAP daemons in default config it finished in 
 35.5sec; w/the daemons w/o IO elevator (to rule out its impact) it took 
 59.7sec w/strange distribution (later runs were slower than earlier runs, 
 still, fastest run was 49.5sec).
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10474:

Description: 
While most queries run faster in LLAP than just Tez with container reuse, TPCH 
Q1 is much slower.
On my run, tez with container reuse (current default LLAP configuration but 
mode == container and no daemons running)  runs 2-6 (out of 6 consecutive runs 
in the same session) finished in 25.5sec average; with 16 LLAP daemons in 
default config the average was 35.5sec; same w/o IO elevator (to rule out its 
impact) it took 59.7sec w/strange distribution (later runs were slower than 
earlier runs, still, fastest run was 49.5sec).

So excluding IO elevator it's more than 2x degradation.

We need to figure out why this is happening. Is it just slot discrepancy? 
Regardless, this needs to be addressed.

  was:
While most queries run faster in LLAP than just Tez with container reuse, TPCH 
Q1 is much slower.
On my run, tez with container reuse (current default LLAP configuration but 
mode == container and no daemons running)  run 2-6 (out of 6) finished in 
25.5sec average; with 16 LLAP daemons in default config it finished in 35.5sec; 
w/the daemons w/o IO elevator (to rule out its impact) it took 59.7sec 
w/strange distribution (later runs were slower than earlier runs, still, 
fastest run was 49.5sec).

We need to figure out why this is happening. Is it just slot discrepancy? 
Regardless, this needs to be addressed.


 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, tez with container reuse (current default LLAP configuration but 
 mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]

2015-04-23 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-10302:
---
Attachment: HIVE-10302.3-spark.patch

 Load small tables (for map join) in executor memory only once [Spark Branch]
 

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.2-spark.patch, HIVE-10302.3-spark.patch, 
 HIVE-10302.spark-1.patch


 Usually there are multiple cores in a Spark executor, and thus it's possible 
 that multiple map-join tasks can be running in the same executor 
 (concurrently or sequentially). Currently, each task will load its own copy 
 of the small tables for map join into memory, ending up with inefficiency. 
 Ideally, we only load the small tables once and share them among the tasks 
 running in that executor.
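The load-once-and-share pattern described above can be sketched as a process-wide cache with load-on-first-request semantics. This is an illustrative Python sketch, not the Spark-branch Java code; `get_small_table` and the loader callback are hypothetical names. The lock ensures concurrent tasks in the same executor trigger exactly one load per table.

```python
import threading

_small_table_cache = {}
_cache_lock = threading.Lock()


def get_small_table(path, loader):
    """Return the shared in-memory copy of a small table.

    The first task to ask for `path` loads it via `loader`; concurrent
    and later tasks in the same process reuse that single copy.
    """
    with _cache_lock:
        if path not in _small_table_cache:
            _small_table_cache[path] = loader(path)
        return _small_table_cache[path]


loads = []


def loader(p):
    loads.append(p)  # record how many real loads happened
    return {"hashtable_for": p}


# Two "tasks" asking for the same small table: one load, one shared object.
a = get_small_table("/tmp/ht1", loader)
b = get_small_table("/tmp/ht1", loader)
shared = (a is b) and loads == ["/tmp/ht1"]
```

A production version would also need eviction and memory accounting (the cached tables compete with task memory), but the sharing mechanism itself is this simple.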



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]

2015-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510158#comment-14510158
 ] 

Hive QA commented on HIVE-10302:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727750/HIVE-10302.3-spark.patch

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 8721 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did 
not produce a TEST-*.xml file
TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not 
produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/837/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/837/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-837/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727750 - PreCommit-HIVE-SPARK-Build

 Load small tables (for map join) in executor memory only once [Spark Branch]
 

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.2-spark.patch, HIVE-10302.3-spark.patch, 
 HIVE-10302.spark-1.patch


 Usually there are multiple cores in a Spark executor, and thus it's possible 
 that multiple map-join tasks can be running in the same executor 
 (concurrently or sequentially). Currently, each task will load its own copy 
 of the small tables for map join into memory, ending up with inefficiency. 
 Ideally, we only load the small tables once and share them among the tasks 
 running in that executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

