[jira] [Commented] (HIVE-4246) Implement predicate pushdown for ORC
[ https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731681#comment-13731681 ] Gopal V commented on HIVE-4246: --- The IN() implementation does a linear search over the predicate leaves right now. Since we are only checking the range and not actual membership, it would be better to store the literals as a sorted list and perform a binary search. In most cases this enables a fast path via the list's min/max. In the corner case where the binary search would insert both min and max at the same position and neither matches an element, no literal falls within the range and we can skip the block. Implement predicate pushdown for ORC Key: HIVE-4246 URL: https://issues.apache.org/jira/browse/HIVE-4246 Project: Hive Issue Type: New Feature Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-4246.D11415.1.patch By using the push-down predicates from the table scan operator, ORC can skip over 10,000 rows at a time that won't satisfy the predicate. This will help a lot, especially if the file is sorted by the column that is used in the predicate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
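A minimal sketch of the range check described in the comment above, assuming the IN() literals are held as a sorted long[] and the row group's min/max statistics are at hand; the class and method names are hypothetical and this is not the actual ORC SearchArgument code.

{code}
import java.util.Arrays;

// Hypothetical sketch (not Hive code): keep the IN() literals sorted and use
// binary search against the row group's min/max statistics. If neither min nor
// max is itself a member and both would be inserted at the same position, no
// literal can fall inside [min, max] and the row group can be skipped.
public class InPredicateRangeCheck {

  static boolean canSkip(long[] sortedLiterals, long groupMin, long groupMax) {
    int lo = Arrays.binarySearch(sortedLiterals, groupMin);
    int hi = Arrays.binarySearch(sortedLiterals, groupMax);
    if (lo >= 0 || hi >= 0) {
      return false;                        // min or max matches a literal exactly
    }
    // both negative: -(insertionPoint) - 1, so compare the insertion points
    return (-(lo + 1)) == (-(hi + 1));     // same slot => no literal in range
  }

  public static void main(String[] args) {
    long[] literals = {5, 10, 20};                  // kept sorted
    System.out.println(canSkip(literals, 11, 19));  // true  - no literal in [11, 19]
    System.out.println(canSkip(literals, 8, 12));   // false - 10 lies in [8, 12]
  }
}
{code}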
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-6.patch HIVE-4531-6.patch resyncs with trunk; e2e tests will be in a follow-up JIRA. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, samplestatusdirwithlist.tar.gz It would be nice to collect task logs after the job finishes. This is similar to what Amazon EMR does. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-6.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, samplestatusdirwithlist.tar.gz It would be nice to collect task logs after the job finishes. This is similar to what Amazon EMR does. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: (was: HIVE-4531-6.patch) [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, samplestatusdirwithlist.tar.gz It would be nice to collect task logs after the job finishes. This is similar to what Amazon EMR does. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5017) DBTokenStore gives compiler warnings
[ https://issues.apache.org/jira/browse/HIVE-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5017: - Attachment: HIVE-5017.1.patch DBTokenStore gives compiler warnings Key: HIVE-5017 URL: https://issues.apache.org/jira/browse/HIVE-5017 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-5017.1.patch In two cases the Method.invoke call is made with (Object[]) null; passing an empty Object array instead silences the compiler warning. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5017) DBTokenStore gives compiler warnings
[ https://issues.apache.org/jira/browse/HIVE-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5017: - Status: Patch Available (was: Open) DBTokenStore gives compiler warnings Key: HIVE-5017 URL: https://issues.apache.org/jira/browse/HIVE-5017 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-5017.1.patch In two cases the Method.invoke call is made with (Object[]) null; passing an empty Object array instead silences the compiler warning. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5017) DBTokenStore gives compiler warnings
Gunther Hagleitner created HIVE-5017: Summary: DBTokenStore gives compiler warnings Key: HIVE-5017 URL: https://issues.apache.org/jira/browse/HIVE-5017 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-5017.1.patch In two cases the Method.invoke call is made with (Object[]) null; passing an empty Object array instead silences the compiler warning. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
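For context, a small standalone illustration of the varargs ambiguity that this kind of warning usually comes from, assuming that is what the issue refers to; the reflective call below is invented for the example and is not the DBTokenStore code.

{code}
import java.lang.reflect.Method;

// Standalone illustration (not DBTokenStore code): Method.invoke is a varargs
// method, so how the "no arguments" case is expressed matters. Passing a bare
// null is ambiguous and draws a compiler warning; the issue notes that passing
// an empty Object array keeps the compiler quiet while meaning the same thing.
public class InvokeNoArgsExample {
  public static void main(String[] args) throws Exception {
    Method m = String.class.getMethod("toUpperCase");
    String s = "hive";

    // String viaBareNull = (String) m.invoke(s, null);        // compiles, but with a varargs warning
    String viaNullArray = (String) m.invoke(s, (Object[]) null); // current form mentioned in the issue
    String viaEmptyArray = (String) m.invoke(s, new Object[0]);  // the form the issue suggests

    System.out.println(viaNullArray + " " + viaEmptyArray);     // HIVE HIVE
  }
}
{code}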
[jira] [Created] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
Benjamin Jakobus created HIVE-5018: -- Summary: Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java 
java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java java/org/apache/hadoop/hive/ql/lockmgr/EmbeddedLockManager.java java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java java/org/apache/hadoop/hive/ql/metadata/Hive.java java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
[jira] [Commented] (HIVE-5009) Fix minor optimization issues
[ https://issues.apache.org/jira/browse/HIVE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731783#comment-13731783 ] Benjamin Jakobus commented on HIVE-5009: ok, thanks. Fix minor optimization issues - Key: HIVE-5009 URL: https://issues.apache.org/jira/browse/HIVE-5009 Project: Hive Issue Type: Improvement Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Original Estimate: 48h Remaining Estimate: 48h

I have found some minor optimization issues in the codebase, which I would like to rectify and contribute. Specifically, the optimizations that could be applied to Hive's code base are as follows:

1. Use StringBuffer when appending strings - In 184 instances, the concatenation operator (+=) was used when appending strings. This is inherently inefficient; Java's StringBuffer or StringBuilder class should be used instead. 12 instances of this optimization can be applied to the GenMRSkewJoinProcessor class and another three to the optimizer. CliDriver uses the + operator inside a loop, as do the column projection utilities class (ColumnProjectionUtils) and the aforementioned skew-join processor. Tests showed that using StringBuilder when appending strings is 57% faster than using the + operator (using the StringBuffer took 122 milliseconds whilst the + operator took 284 milliseconds). The reason the StringBuffer class is preferred over the + operator is that String third = first + second; gets compiled to: StringBuilder builder = new StringBuilder( first ); builder.append( second ); third = builder.toString(); Therefore, building complex strings that, for example, involve loops requires many instantiations (and, as discussed below, creating new objects inside loops is inefficient).

2. Use arrays instead of List - The asList method of Java's java.util.Arrays class is more efficient at creating lists from arrays than using loops to manually iterate over the elements (using asList is computationally very cheap, O(1), as it merely creates a wrapper object around the array; looping through the list however has a complexity of O(n) since a new list is created and every element in the array is added to this new list). As confirmed by the experiment detailed in Appendix D, the Java compiler does not automatically optimize and replace tight-loop copying with asList: the loop-copying of 1,000,000 items took 15 milliseconds whilst using asList is instant. Four instances of this optimization can be applied to Hive's codebase (two of these should be applied to the Map-Join container - MapJoinRowContainer) - lines 92 to 98: for (obj = other.first(); obj != null; obj = other.next()) { ArrayList<Object> ele = new ArrayList<Object>(obj.length); for (int i = 0; i < obj.length; i++) { ele.add(obj[i]); } list.add((Row) ele); }

3. Unnecessary wrapper object creation - In 31 cases, wrapper object creation could be avoided by simply using the provided static conversion methods. As noted in the PMD documentation, using these avoids the cost of creating objects that also need to be garbage-collected later. For example, line 587 of the SemanticAnalyzer class could be replaced by the more efficient parseDouble method call: // Inefficient: Double percent = Double.valueOf(value).doubleValue(); // To be replaced by: Double percent = Double.parseDouble(value); Our test case in Appendix D confirms this: converting 10,000 strings into integers via an unnecessary wrapper object (Integer.valueOf(gen.nextSessionId())) took 119 milliseconds on average; using parseInt() took only 38. Therefore creating even just one unnecessary wrapper object can make your code up to 68% slower.

4. Converting literals to strings using + - Converting literals to strings using + is quite inefficient (see Appendix D) and should be done by calling the toString() method instead: converting 1,000,000 integers to strings using + took, on average, 1340 milliseconds whilst using the toString() method only required 1183 milliseconds (hence adding empty strings takes nearly 12% more time). 89 instances of using + when converting literals were found in Hive's codebase - one of these is in JoinUtil.

5. Avoid manual copying of arrays - Instead of copying arrays as is done in GroupByOperator on line 1040 (see below), the more efficient System.arraycopy can be used (arraycopy is a native method, meaning that the entire memory block is copied using memcpy or memmove). // Line 1040 of the GroupByOperator for (int i = 0; i < keys.length; i++) { forwardCache[i] = keys[i]; }
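A self-contained toy example of items 1 and 2 from the list above (StringBuilder instead of += in a loop, Arrays.asList instead of a manual copy loop); it is illustrative only and the snippets are not taken from the Hive classes mentioned.

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only - not Hive code. Shows the two patterns from items 1 and 2.
public class MinorOptimizationsExample {
  public static void main(String[] args) {
    String[] cols = {"id", "name", "ts"};

    // Item 1: appending with += in a loop creates a throwaway StringBuilder per
    // iteration; one explicit StringBuilder avoids that.
    StringBuilder sb = new StringBuilder();
    for (String col : cols) {
      if (sb.length() > 0) {
        sb.append(", ");
      }
      sb.append(col);
    }
    System.out.println(sb);                             // id, name, ts

    // Item 2: Arrays.asList is an O(1) wrapper around the array, whereas the
    // manual loop below allocates a new list and copies every element.
    List<String> wrapped = Arrays.asList(cols);
    List<String> copied = new ArrayList<String>(cols.length);
    for (String col : cols) {
      copied.add(col);
    }
    System.out.println(wrapped.equals(copied));         // true
  }
}
{code}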
[jira] [Updated] (HIVE-5009) Fix minor optimization issues
[ https://issues.apache.org/jira/browse/HIVE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5009: --- Description: I have found some minor optimization issues in the codebase, which I would like to rectify and contribute. Specifically, the optimizations that could be applied to Hive's code base are as follows:

1. Use StringBuffer when appending strings - In 184 instances, the concatenation operator (+=) was used when appending strings. This is inherently inefficient; Java's StringBuffer or StringBuilder class should be used instead. 12 instances of this optimization can be applied to the GenMRSkewJoinProcessor class and another three to the optimizer. CliDriver uses the + operator inside a loop, as do the column projection utilities class (ColumnProjectionUtils) and the aforementioned skew-join processor. Tests showed that using StringBuilder when appending strings is 57% faster than using the + operator (using the StringBuffer took 122 milliseconds whilst the + operator took 284 milliseconds). The reason the StringBuffer class is preferred over the + operator is that String third = first + second; gets compiled to: StringBuilder builder = new StringBuilder( first ); builder.append( second ); third = builder.toString(); Therefore, building complex strings that, for example, involve loops requires many instantiations (and, as discussed below, creating new objects inside loops is inefficient).

2. Use arrays instead of List - The asList method of Java's java.util.Arrays class is more efficient at creating lists from arrays than using loops to manually iterate over the elements (using asList is computationally very cheap, O(1), as it merely creates a wrapper object around the array; looping through the list however has a complexity of O(n) since a new list is created and every element in the array is added to this new list). As confirmed by the experiment detailed in Appendix D, the Java compiler does not automatically optimize and replace tight-loop copying with asList: the loop-copying of 1,000,000 items took 15 milliseconds whilst using asList is instant. Four instances of this optimization can be applied to Hive's codebase (two of these should be applied to the Map-Join container - MapJoinRowContainer) - lines 92 to 98: for (obj = other.first(); obj != null; obj = other.next()) { ArrayList<Object> ele = new ArrayList<Object>(obj.length); for (int i = 0; i < obj.length; i++) { ele.add(obj[i]); } list.add((Row) ele); }

3. Unnecessary wrapper object creation - In 31 cases, wrapper object creation could be avoided by simply using the provided static conversion methods. As noted in the PMD documentation, using these avoids the cost of creating objects that also need to be garbage-collected later. For example, line 587 of the SemanticAnalyzer class could be replaced by the more efficient parseDouble method call: // Inefficient: Double percent = Double.valueOf(value).doubleValue(); // To be replaced by: Double percent = Double.parseDouble(value); Our test case in Appendix D confirms this: converting 10,000 strings into integers via an unnecessary wrapper object (Integer.valueOf(gen.nextSessionId())) took 119 milliseconds on average; using parseInt() took only 38. Therefore creating even just one unnecessary wrapper object can make your code up to 68% slower.

4. Converting literals to strings using + - Converting literals to strings using + is quite inefficient (see Appendix D) and should be done by calling the toString() method instead: converting 1,000,000 integers to strings using + took, on average, 1340 milliseconds whilst using the toString() method only required 1183 milliseconds (hence adding empty strings takes nearly 12% more time). 89 instances of using + when converting literals were found in Hive's codebase - one of these is in JoinUtil.

5. Avoid manual copying of arrays - Instead of copying arrays as is done in GroupByOperator on line 1040 (see below), the more efficient System.arraycopy can be used (arraycopy is a native method, meaning that the entire memory block is copied using memcpy or memmove). // Line 1040 of the GroupByOperator for (int i = 0; i < keys.length; i++) { forwardCache[i] = keys[i]; } Using System.arraycopy on an array of 10,000 strings was (close to) instant whilst the manual copy took 6 milliseconds. 11 instances of this optimization should be applied to the Hive codebase. A short standalone version of this pattern is sketched after this message.

6. Avoiding instantiation inside loops - As noted in the PMD documentation, new objects created within loops should be checked to see if they can be created outside them and reused. Declaring variables inside a loop (i from 0 to 10,000) took 300 milliseconds whilst declaring them outside took only 88 milliseconds (this can be
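As referenced in item 5 above, a short, self-contained version of the System.arraycopy pattern; the array contents are invented for the example and this is not the GroupByOperator code.

{code}
import java.util.Arrays;

// Illustrative only - not the GroupByOperator code.
public class ArrayCopyExample {
  public static void main(String[] args) {
    Object[] keys = {"k1", "k2", "k3"};
    Object[] forwardCache = new Object[keys.length];

    // Manual copy flagged by the report:
    //   for (int i = 0; i < keys.length; i++) { forwardCache[i] = keys[i]; }

    // Preferred: a single native call that copies the whole block.
    System.arraycopy(keys, 0, forwardCache, 0, keys.length);

    System.out.println(Arrays.toString(forwardCache));  // [k1, k2, k3]
  }
}
{code}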
[jira] [Created] (HIVE-5019) Use StringBuffer instead of += (issue 1)
Benjamin Jakobus created HIVE-5019: -- Summary: Use StringBuffer instead of += (issue 1) Key: HIVE-5019 URL: https://issues.apache.org/jira/browse/HIVE-5019 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Issue 1 (use of StringBuffer over +=) java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java java/org/apache/hadoop/hive/ql/lib/RuleExactMatch.java java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java java/org/apache/hadoop/hive/ql/metadata/Partition.java java/org/apache/hadoop/hive/ql/metadata/Table.java java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
[jira] [Commented] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731794#comment-13731794 ] Prasanth J commented on HIVE-4123: -- Code comment improvements/fixes, removed some redundant code, long repeat runs now use DELTA encoding directly instead of calling the determineEncoding() function, and a few more changes. The RLE encoding for ORC can be improved Key: HIVE-4123 URL: https://issues.apache.org/jira/browse/HIVE-4123 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Prasanth J Labels: orcfile Fix For: 0.12.0 Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt, HIVE-4123.6.txt, ORC-Compression-Ratio-Comparison.xlsx The run length encoding of integers can be improved: * tighter bit packing * allow delta encoding * allow longer runs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
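As a rough illustration of what a delta run means here, the toy encoder below collapses a run of integers with a constant difference into a (base, delta, length) triple; it is only a sketch of the general idea, not the ORC RunLengthIntegerWriterV2 logic or its on-disk format.

{code}
import java.util.ArrayList;
import java.util.List;

// Toy sketch of delta run-length encoding - NOT the ORC writer. A run with a
// constant difference between neighbours is stored as (base, delta, length).
public class DeltaRunSketch {

  static final class Run {
    final long base; final long delta; final int length;
    Run(long base, long delta, int length) {
      this.base = base; this.delta = delta; this.length = length;
    }
    @Override public String toString() {
      return "(base=" + base + ", delta=" + delta + ", len=" + length + ")";
    }
  }

  static List<Run> encode(long[] values) {
    List<Run> runs = new ArrayList<Run>();
    int i = 0;
    while (i < values.length) {
      int j = i;
      long delta = (i + 1 < values.length) ? values[i + 1] - values[i] : 0;
      // extend the run while the difference between neighbours stays constant
      while (j + 1 < values.length && values[j + 1] - values[j] == delta) {
        j++;
      }
      runs.add(new Run(values[i], delta, j - i + 1));
      i = j + 1;
    }
    return runs;
  }

  public static void main(String[] args) {
    long[] values = {2, 2, 2, 2, 10, 12, 14, 16, 18};
    // [(base=2, delta=0, len=4), (base=10, delta=2, len=5)]
    System.out.println(encode(values));
  }
}
{code}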
[jira] [Updated] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-4123: - Attachment: HIVE-4123.6.txt The RLE encoding for ORC can be improved Key: HIVE-4123 URL: https://issues.apache.org/jira/browse/HIVE-4123 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Prasanth J Labels: orcfile Fix For: 0.12.0 Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt, HIVE-4123.6.txt, ORC-Compression-Ratio-Comparison.xlsx The run length encoding of integers can be improved: * tighter bit packing * allow delta encoding * allow longer runs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5009) Fix minor optimization issues
[ https://issues.apache.org/jira/browse/HIVE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5009: --- Attachment: AbstractBucketJoinProc.java Fix minor optimization issues - Key: HIVE-5009 URL: https://issues.apache.org/jira/browse/HIVE-5009 Project: Hive Issue Type: Improvement Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractBucketJoinProc.java Original Estimate: 48h Remaining Estimate: 48h

I have found some minor optimization issues in the codebase, which I would like to rectify and contribute. Specifically, the optimizations that could be applied to Hive's code base are as follows:

1. Use StringBuffer when appending strings - In 184 instances, the concatenation operator (+=) was used when appending strings. This is inherently inefficient; Java's StringBuffer or StringBuilder class should be used instead. 12 instances of this optimization can be applied to the GenMRSkewJoinProcessor class and another three to the optimizer. CliDriver uses the + operator inside a loop, as do the column projection utilities class (ColumnProjectionUtils) and the aforementioned skew-join processor. Tests showed that using StringBuilder when appending strings is 57% faster than using the + operator (using the StringBuffer took 122 milliseconds whilst the + operator took 284 milliseconds). The reason the StringBuffer class is preferred over the + operator is that String third = first + second; gets compiled to: StringBuilder builder = new StringBuilder( first ); builder.append( second ); third = builder.toString(); Therefore, building complex strings that, for example, involve loops requires many instantiations (and, as discussed below, creating new objects inside loops is inefficient).

2. Use arrays instead of List - The asList method of Java's java.util.Arrays class is more efficient at creating lists from arrays than using loops to manually iterate over the elements (using asList is computationally very cheap, O(1), as it merely creates a wrapper object around the array; looping through the list however has a complexity of O(n) since a new list is created and every element in the array is added to this new list). As confirmed by the experiment detailed in Appendix D, the Java compiler does not automatically optimize and replace tight-loop copying with asList: the loop-copying of 1,000,000 items took 15 milliseconds whilst using asList is instant. Four instances of this optimization can be applied to Hive's codebase (two of these should be applied to the Map-Join container - MapJoinRowContainer) - lines 92 to 98: for (obj = other.first(); obj != null; obj = other.next()) { ArrayList<Object> ele = new ArrayList<Object>(obj.length); for (int i = 0; i < obj.length; i++) { ele.add(obj[i]); } list.add((Row) ele); }

3. Unnecessary wrapper object creation - In 31 cases, wrapper object creation could be avoided by simply using the provided static conversion methods. As noted in the PMD documentation, using these avoids the cost of creating objects that also need to be garbage-collected later. For example, line 587 of the SemanticAnalyzer class could be replaced by the more efficient parseDouble method call: // Inefficient: Double percent = Double.valueOf(value).doubleValue(); // To be replaced by: Double percent = Double.parseDouble(value); Our test case in Appendix D confirms this: converting 10,000 strings into integers via an unnecessary wrapper object (Integer.valueOf(gen.nextSessionId())) took 119 milliseconds on average; using parseInt() took only 38. Therefore creating even just one unnecessary wrapper object can make your code up to 68% slower. A minimal sketch of this pattern follows this message.

4. Converting literals to strings using + - Converting literals to strings using + is quite inefficient (see Appendix D) and should be done by calling the toString() method instead: converting 1,000,000 integers to strings using + took, on average, 1340 milliseconds whilst using the toString() method only required 1183 milliseconds (hence adding empty strings takes nearly 12% more time). 89 instances of using + when converting literals were found in Hive's codebase - one of these is in JoinUtil.

5. Avoid manual copying of arrays - Instead of copying arrays as is done in GroupByOperator on line 1040 (see below), the more efficient System.arraycopy can be used (arraycopy is a native method, meaning that the entire memory block is copied using memcpy or memmove). // Line 1040 of the GroupByOperator for (int i = 0; i < keys.length; i++) { forwardCache[i] = keys[i]; }
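As referenced in item 3 above, a minimal illustration of the static-parse-instead-of-boxing pattern; the value is invented and this is not the SemanticAnalyzer code.

{code}
// Illustrative only - not the SemanticAnalyzer code.
public class WrapperCreationExample {
  public static void main(String[] args) {
    String value = "0.75";

    // Flagged pattern: allocates a Double wrapper only to unbox it again.
    double viaWrapper = Double.valueOf(value).doubleValue();

    // Preferred: parses straight to the primitive, no intermediate object.
    double viaParse = Double.parseDouble(value);

    System.out.println(viaWrapper == viaParse);  // true
  }
}
{code}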
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: AbstractGenericUDFEWAHBitmapBop.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java java/org/apache/hadoop/hive/ql/lockmgr/EmbeddedLockManager.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: BaseSemanticAnalyzer.java AbstractSMBJoinProc.java AbstractJoinTaskDispatcher.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: (was: BaseSemanticAnalyzer.java) Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java 
java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: BaseSemanticAnalyzer.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java 
java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java
[jira] [Commented] (HIVE-3363) Special characters (such as 'é') displayed as '?' in Hive
[ https://issues.apache.org/jira/browse/HIVE-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731825#comment-13731825 ] Kousuke Saruta commented on HIVE-3363: -- I think this problem may be similar to HIVE-2137. In SQLOperation.java, getNextRowSet() has code as follows:

{code}
for (String rowString : rows) {
  rowObj = serde.deserialize(new BytesWritable(rowString.getBytes()));
  for (int i = 0; i < fieldRefs.size(); i++) {
    StructField fieldRef = fieldRefs.get(i);
    fieldOI = fieldRef.getFieldObjectInspector();
    deserializedFields[i] = convertLazyToJava(soi.getStructFieldData(rowObj, fieldRef), fieldOI);
  }
  rowSet.addRow(resultSchema, deserializedFields);
}
{code}

The code above uses getBytes() without specifying an encoding, so it will use the system default encoding. If the front end of Hive is used on Windows, an encoding mismatch will happen because Hive (Hadoop) expects UTF-8 as its character encoding but Windows uses Shift_JIS. So I think the code above should be as follows:

{code}
for (String rowString : rows) {
  rowObj = serde.deserialize(new BytesWritable(rowString.getBytes("UTF-8")));
  for (int i = 0; i < fieldRefs.size(); i++) {
    StructField fieldRef = fieldRefs.get(i);
    fieldOI = fieldRef.getFieldObjectInspector();
    deserializedFields[i] = convertLazyToJava(soi.getStructFieldData(rowObj, fieldRef), fieldOI);
  }
  rowSet.addRow(resultSchema, deserializedFields);
}
{code}

Special characters (such as 'é') displayed as '?' in Hive - Key: HIVE-3363 URL: https://issues.apache.org/jira/browse/HIVE-3363 Project: Hive Issue Type: Bug Reporter: Anand Balaraman I am facing an issue while viewing special characters (such as é) using Hive. If I view the file in HDFS (using the hadoop fs -cat command), it is displayed correctly as ’é’, but when I select the data using Hive, this character alone gets replaced by a question mark. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
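A tiny standalone demonstration of the mismatch described in the comment above (the class name is made up; this is not SQLOperation.java): without an explicit charset, getBytes() follows the JVM's platform default, so the same string can serialize to different bytes on different machines.

{code}
import java.nio.charset.Charset;

// Standalone demonstration - not Hive code. getBytes() with no argument uses the
// platform default charset, so the byte form of a non-ASCII character differs
// between, say, a UTF-8 and a Shift_JIS default. Passing the charset explicitly
// makes the result deterministic, which is what the suggested fix above does.
public class DefaultCharsetPitfall {
  public static void main(String[] args) throws Exception {
    String s = "é";
    System.out.println("default charset : " + Charset.defaultCharset());
    System.out.println("default getBytes: " + s.getBytes().length + " byte(s)");        // depends on the JVM default
    System.out.println("UTF-8 getBytes  : " + s.getBytes("UTF-8").length + " byte(s)"); // always 2
  }
}
{code}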
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: BlockMergeTask.java BitmapIndexHandler.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java
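The sub-task above applies one general refactoring, hoisting object creation out of loops, across the attached files. Purely as an illustration of that pattern (the class and method names below are hypothetical and not taken from any of the attached Hive patches), the before/after shape of the change in Java is roughly:

{code:java}
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "avoid object instantiation in loops" refactoring;
// not copied from the attached files.
public class LoopAllocationSketch {

  // Before: a fresh StringBuilder is allocated on every iteration.
  static String joinPerIteration(List<String> columns) {
    String result = "";
    for (String col : columns) {
      StringBuilder sb = new StringBuilder();  // new object on each pass
      sb.append(result).append(col).append(',');
      result = sb.toString();
    }
    return result;
  }

  // After: the StringBuilder is created once, outside the loop, and reused.
  static String joinHoisted(List<String> columns) {
    StringBuilder sb = new StringBuilder();    // single allocation
    for (String col : columns) {
      sb.append(col).append(',');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("id", "name", "ts");
    System.out.println(joinPerIteration(cols));  // id,name,ts,
    System.out.println(joinHoisted(cols));       // id,name,ts,
  }
}
{code}

Both methods produce the same output; the second simply avoids one allocation per loop iteration, which is the stated goal of the sub-task.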
[jira] [Updated] (HIVE-3363) Special characters (such as 'é') displayed as '?' in Hive
[ https://issues.apache.org/jira/browse/HIVE-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-3363: - Attachment: HIVE-3363.patch Initial patch. Special characters (such as 'é') displayed as '?' in Hive - Key: HIVE-3363 URL: https://issues.apache.org/jira/browse/HIVE-3363 Project: Hive Issue Type: Bug Reporter: Anand Balaraman Attachments: HIVE-3363.patch I am facing an issue while viewing special characters (such as é) using Hive. If I view the file in HDFS (using hadoop fs -cat command), it is displayed correctly as ’é’, but when I select the data using Hive, this character alone gets replaced by a question mark. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
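The report above does not identify the root cause. One assumption worth checking (illustrative only, not confirmed for this issue) is a decode or encode step somewhere in the read path that uses a charset unable to represent 'é'. A minimal Java sketch of how such a substitution produces exactly the reported '?' symptom:

{code:java}
import java.nio.charset.StandardCharsets;

public class CharsetSubstitutionDemo {
  public static void main(String[] args) {
    // Decoding UTF-8 bytes with the matching charset round-trips correctly.
    byte[] utf8 = "é".getBytes(StandardCharsets.UTF_8);
    System.out.println(new String(utf8, StandardCharsets.UTF_8));      // é

    // Encoding with a charset that cannot represent the character substitutes
    // its replacement byte '?', which matches the symptom described above.
    byte[] ascii = "é".getBytes(StandardCharsets.US_ASCII);
    System.out.println(new String(ascii, StandardCharsets.US_ASCII));  // ?
  }
}
{code}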
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: BucketingSortingOpProcFactory.java BucketingSortingInferenceOptimizer.java BucketMapJoinContext.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketMapJoinContext.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java 
java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: BucketingSortingReduceSinkOptimizer.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketMapJoinContext.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 
java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: BucketizedHiveInputFormat.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java 
java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: ColumnPrunerProcFactory.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java 
java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: ColumnPrunerProcFactory.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnPrunerProcFactory.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java 
java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
[jira] [Commented] (HIVE-4233) The TGT gotten from class 'CLIService' should be renewed on time
[ https://issues.apache.org/jira/browse/HIVE-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731853#comment-13731853 ] Hive QA commented on HIVE-4233: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596469/HIVE-4233.5.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2767 tests executed *Failed tests:* {noformat} org.apache.hcatalog.pig.TestHCatStorerMulti.testStorePartitionedTable {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/327/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/327/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. The TGT gotten from class 'CLIService' should be renewed on time - Key: HIVE-4233 URL: https://issues.apache.org/jira/browse/HIVE-4233 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Environment: CentOS release 6.3 (Final) jdk1.6.0_31 HiveServer2 0.10.0-cdh4.2.0 Kerberos Security Reporter: Dongyong Wang Assignee: Thejas M Nair Priority: Critical Attachments: 0001-FIX-HIVE-4233.patch, HIVE-4233-2.patch, HIVE-4233-3.patch, HIVE-4233.4.patch, HIVE-4233.5.patch When the HiveServer2 has been running for more than 7 days and I use the beeline shell to connect to HiveServer2, all operations fail. The HiveServer2 log shows this is caused by a Kerberos auth failure; the exception stack trace is: 2013-03-26 11:55:20,932 ERROR hive.ql.metadata.Hive: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1084) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:51) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151) at org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2275) at org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:358) at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:127) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1073) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1058) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown Source) at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082) ... 16 more Caused by: java.lang.IllegalStateException: This ticket is no longer valid at javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:601) at java.lang.String.valueOf(String.java:2826) at java.lang.StringBuilder.append(StringBuilder.java:115) at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:120) at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:41) at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:130) at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:328) at java.security.AccessController.doPrivileged(Native Method) at
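The trace above ends at an expired Kerberos ticket ("This ticket is no longer valid"), consistent with the default TGT lifetime being shorter than the server's uptime. Purely as a sketch of one common mitigation in Hadoop-based daemons, and not necessarily what the attached patches implement, a periodic keytab re-login could look like this (the renewal interval is an assumption):

{code:java}
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.security.UserGroupInformation;

// Illustrative only: refreshes the process TGT from the keytab on a schedule
// so it never silently expires while the server keeps running.
public class TgtRenewalSketch {
  public static ScheduledExecutorService start(long intervalHours) {
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleWithFixedDelay(new Runnable() {
      @Override
      public void run() {
        try {
          // Re-acquires a TGT from the keytab if the current one is stale.
          UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
        } catch (IOException e) {
          e.printStackTrace();
        }
      }
    }, intervalHours, intervalHours, TimeUnit.HOURS);
    return scheduler;
  }
}
{code}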
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: ColumnStatsTask.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java 
java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: CombineHiveInputFormat.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java 
java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: CommonJoinOperator.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java 
java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: (was: ColumnPrunerProcFactory.java) Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java 
java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: ConditionalResolverCommonJoin.java CommonJoinTaskDispatcher.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 
java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java java/org/apache/hadoop/hive/ql/io/orc/DynamicIntArray.java java/org/apache/hadoop/hive/ql/io/orc/FileDump.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: DDLSemanticAnalyzer.java CorrelationOptimizer.java Context.java ConditionalResolverSkewJoin.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java, ConditionalResolverSkewJoin.java, Context.java, CorrelationOptimizer.java, DDLSemanticAnalyzer.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java 
java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java
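As an aside on what this sub-task targets, the following is an illustrative sketch only (it is not taken from any of the attached files) of moving an allocation out of a hot loop, here by reusing a single StringBuilder instead of creating one per iteration:
{code}
import java.util.List;

public class LoopAllocationSketch {
  // Before: a new StringBuilder (and an intermediate String) is created on
  // every pass through the loop.
  static String joinPerIteration(List<String> parts) {
    String result = "";
    for (String p : parts) {
      StringBuilder sb = new StringBuilder(); // fresh object each iteration
      sb.append(result).append(p).append(',');
      result = sb.toString();
    }
    return result;
  }

  // After: a single StringBuilder is allocated once and reused for the
  // whole loop, avoiding per-iteration garbage.
  static String joinReused(List<String> parts) {
    StringBuilder sb = new StringBuilder();
    for (String p : parts) {
      sb.append(p).append(',');
    }
    return sb.toString();
  }
}
{code}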
[jira] [Updated] (HIVE-3363) Special characters (such as 'é') displayed as '?' in Hive
[ https://issues.apache.org/jira/browse/HIVE-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-3363: - Affects Version/s: 0.12.0 Status: Patch Available (was: Open) Special characters (such as 'é') displayed as '?' in Hive - Key: HIVE-3363 URL: https://issues.apache.org/jira/browse/HIVE-3363 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anand Balaraman Attachments: HIVE-3363.patch I am facing an issue while viewing special characters (such as é) using Hive. If I view the file in HDFS (using hadoop fs -cat command), it is displayed correctly as ’é’, but when I select the data using Hive, this character alone gets replaced by a question mark. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
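For background on the symptom, a common cause of UTF-8 characters turning into '?' is decoding the raw bytes with a non-UTF-8 default charset; the snippet below is a minimal, self-contained illustration of that effect and is not the fix in HIVE-3363.patch:
{code}
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchSketch {
  public static void main(String[] args) {
    // "é" stored as UTF-8 bytes (0xC3 0xA9), as it would sit in the HDFS file.
    byte[] utf8Bytes = "é".getBytes(StandardCharsets.UTF_8);

    // Decoding with a charset that cannot represent those bytes (US-ASCII here,
    // standing in for a wrong platform default) yields replacement characters
    // that render as '?'.
    System.out.println(new String(utf8Bytes, Charset.forName("US-ASCII")));

    // Decoding with the charset the data was written in round-trips correctly.
    System.out.println(new String(utf8Bytes, StandardCharsets.UTF_8));
  }
}
{code}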
[jira] [Commented] (HIVE-5017) DBTokenStore gives compiler warnings
[ https://issues.apache.org/jira/browse/HIVE-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731883#comment-13731883 ] Hive QA commented on HIVE-5017: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596531/HIVE-5017.1.patch {color:green}SUCCESS:{color} +1 2767 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/329/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/329/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. DBTokenStore gives compiler warnings Key: HIVE-5017 URL: https://issues.apache.org/jira/browse/HIVE-5017 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-5017.1.patch The Method.invoke call in 2 cases is done via (Object[]) null; passing an empty Object array instead will silence the compiler warning. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
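As an illustration of the warning being discussed (using a hypothetical reflective call on String.length(), not the DBTokenStore code itself):
{code}
import java.lang.reflect.Method;

public class InvokeWarningSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical stand-in for the reflective call: a no-argument method.
    Method length = String.class.getMethod("length");

    // Passing a bare null argument array is ambiguous to the compiler (null
    // array vs. a single null argument) and can draw a non-varargs warning:
    //   Object r = length.invoke("hello", null);

    // An explicit empty array states "no arguments" unambiguously and
    // compiles without the warning.
    Object r = length.invoke("hello", new Object[0]);
    System.out.println(r); // prints 5
  }
}
{code}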
[jira] [Commented] (HIVE-3363) Special characters (such as 'é') displayed as '?' in Hive
[ https://issues.apache.org/jira/browse/HIVE-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731917#comment-13731917 ] Hive QA commented on HIVE-3363: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596565/HIVE-3363.patch {color:green}SUCCESS:{color} +1 2767 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/331/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/331/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Special characters (such as 'é') displayed as '?' in Hive - Key: HIVE-3363 URL: https://issues.apache.org/jira/browse/HIVE-3363 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anand Balaraman Attachments: HIVE-3363.patch I am facing an issue while viewing special characters (such as é) using Hive. If I view the file in HDFS (using hadoop fs -cat command), it is displayed correctly as ’é’, but when I select the data using Hive, this character alone gets replaced by a question mark. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732024#comment-13732024 ] Brock Noland commented on HIVE-5018: [~benjamin.jakobus] I really appreciate the work you are doing! However, we'll need you to submit the changes in a slightly different form. The way this project works is that people submit patches or diffs which contain only the changes. This avoids issues of change integration where two people are working the same file at the same time. Here is a quick introduction on how to create a patch: {noformat} $ git clone https://github.com/apache/hive.git $ cd hive $ vim build.properties (make some trivial change) $ git diff /tmp/my-jira.patch {noformat} For example over on HIVE-3363 there is a file HIVE-3363.patch which was generated in such a manner. Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java, ConditionalResolverSkewJoin.java, Context.java, CorrelationOptimizer.java, DDLSemanticAnalyzer.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java 
java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
[jira] [Commented] (HIVE-4948) WriteLockTest and ZNodeNameTest do not follow test naming pattern
[ https://issues.apache.org/jira/browse/HIVE-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732038#comment-13732038 ] Hudson commented on HIVE-4948: -- FAILURE: Integrated in Hive-trunk-h0.21 #2249 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2249/]) HIVE-4948: WriteLockTest and ZNodeNameTest do not follow test naming pattern (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1511075) * /hive/trunk/hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/snapshot/lock/TestWriteLock.java * /hive/trunk/hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/snapshot/lock/TestZNodeName.java * /hive/trunk/hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/snapshot/lock/WriteLockTest.java * /hive/trunk/hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/snapshot/lock/ZNodeNameTest.java WriteLockTest and ZNodeNameTest do not follow test naming pattern - Key: HIVE-4948 URL: https://issues.apache.org/jira/browse/HIVE-4948 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4948.patch These tests should be renamed TestWriteLock and TestZNodeName org.apache.hcatalog.hbase.snapshot.lock.WriteLockTest org.apache.hcatalog.hbase.snapshot.lock.ZNodeNameTest -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732039#comment-13732039 ] Hudson commented on HIVE-4870: -- FAILURE: Integrated in Hive-trunk-h0.21 #2249 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2249/]) HIVE-4870 : Explain Extended to show partition info for Fetch Task (Laljo John Pullokkaran via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1511066) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_1.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_2.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_3.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_4.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_7.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_8.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin10.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin11.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin12.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin13.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin7.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin8.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin9.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out * /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out * /hive/trunk/ql/src/test/results/clientpositive/join32.q.out * /hive/trunk/ql/src/test/results/clientpositive/join32_lessSize.q.out * /hive/trunk/ql/src/test/results/clientpositive/join33.q.out * /hive/trunk/ql/src/test/results/clientpositive/sort_merge_join_desc_6.q.out * /hive/trunk/ql/src/test/results/clientpositive/sort_merge_join_desc_7.q.out * /hive/trunk/ql/src/test/results/clientpositive/stats11.q.out * /hive/trunk/ql/src/test/results/clientpositive/union22.q.out Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.0 Attachments: HIVE-4870.patch Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4995) select * may incorrectly return empty fields with hbase-handler
[ https://issues.apache.org/jira/browse/HIVE-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732037#comment-13732037 ] Hudson commented on HIVE-4995: -- FAILURE: Integrated in Hive-trunk-h0.21 #2249 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2249/]) HIVE-4995: select * may incorrectly return empty fields with hbase-handler (Swarnim Kulkarni via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1510973) * /hive/trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java * /hive/trunk/hbase-handler/src/test/queries/positive/hbase_binary_map_queries_prefix.q * /hive/trunk/hbase-handler/src/test/results/positive/hbase_binary_map_queries_prefix.q.out select * may incorrectly return empty fields with hbase-handler --- Key: HIVE-4995 URL: https://issues.apache.org/jira/browse/HIVE-4995 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Fix For: 0.12.0 Attachments: HIVE-4995.1.patch.txt, HIVE-4995.1.patch.txt HIVE-3725 added capability to pull hbase columns with prefixes. However the way the current logic to add columns stands in HiveHBaseTableInputFormat, it might cause some columns to incorrectly display empty fields. Consider the following query: {noformat} CREATE EXTERNAL TABLE test_table(key string, value1 map<string,string>, value2 string) ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf-a:prefix.*,cf-a:another_col") TBLPROPERTIES ("hbase.table.name" = "test_table"); {noformat} Given the existing logic in HiveHBaseTableInputFormat: {code} for (int i = 0; i < columnsMapping.size(); i++) { ColumnMapping colMap = columnsMapping.get(i); if (colMap.hbaseRowKey) { continue; } if (colMap.qualifierName == null) { scan.addFamily(colMap.familyNameBytes); } else { scan.addColumn(colMap.familyNameBytes, colMap.qualifierNameBytes); } } {code} So for the above query, the 'addFamily' will be called first followed by 'addColumn' for the column family cf-a. This will wipe away whatever we had set with the 'addFamily' call in the previous step resulting in an empty column when queried. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
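A minimal standalone sketch of the Scan behaviour described above, using the plain HBase client API rather than HiveHBaseTableInputFormat: once addColumn is called for a family, the earlier addFamily request for that same family is narrowed to the single qualifier.
{code}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanNarrowingSketch {
  public static void main(String[] args) {
    byte[] family = Bytes.toBytes("cf-a");

    Scan scan = new Scan();
    scan.addFamily(family);                               // request all of cf-a
    scan.addColumn(family, Bytes.toBytes("another_col")); // narrows cf-a to one qualifier

    // The family map now lists only "another_col" for cf-a, so prefix-mapped
    // columns such as cf-a:prefix.* would come back empty for this scan.
    System.out.println(scan.getFamilyMap());
  }
}
{code}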
[jira] [Updated] (HIVE-5011) Dynamic partitioning in HCatalog broken on external tables
[ https://issues.apache.org/jira/browse/HIVE-5011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5011: --- Attachment: HIVE-5011.patch Attaching patch. Dynamic partitioning in HCatalog broken on external tables -- Key: HIVE-5011 URL: https://issues.apache.org/jira/browse/HIVE-5011 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5011.patch Dynamic partitioning with HCatalog has been broken as a result of HCATALOG-500 trying to support user-set paths for external tables. The goal there was to be able to support other custom destinations apart from the normal hive-style partitions. However, it is not currently possible for users to set paths for dynamic ptn writes, since we don't support any way for users to specify patterns(like, say $\{rootdir\}/$v1.$v2/) into which writes happen, only locations, and the values for dyn. partitions are not known ahead of time. Also, specifying a custom path messes with the way dynamic ptn. code tries to determine what was written to where from the output committer, which means that even if we supported patterned-writes instead of location-writes, we still have to do some more deep diving into the output committer code to support it. Thus, my current proposal is that we honour writes to user-specified paths for external tables *ONLY* for static partition writes - i.e., if we can determine that the write is a dyn. ptn. write, we will ignore the user specification. (Note that this does not mean we ignore the table's external location - we honour that - we just don't honour any HCatStorer/etc provided additional location - we stick to what metadata tells us the root location is. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport
[ https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732118#comment-13732118 ] Ashutosh Chauhan commented on HIVE-4911: +1 LGTM Enable QOP configuration for Hive Server 2 thrift transport --- Key: HIVE-4911 URL: https://issues.apache.org/jira/browse/HIVE-4911 Project: Hive Issue Type: New Feature Reporter: Arup Malakar Assignee: Arup Malakar Attachments: 20-build-temp-change-1.patch, 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch The QoP for hive server 2 should be configurable to enable encryption. A new configuration should be exposed hive.server2.thrift.rpc.protection. This would give greater control configuring hive server 2 service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
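For illustration only: such a property would presumably be set in hive-site.xml like other HiveServer2 options; the value shown is an assumption (auth-conf being the standard SASL QOP level that adds confidentiality/encryption), not something taken from the attached patches.
{noformat}
<!-- hive-site.xml, illustrative only; value not taken from the patches -->
<property>
  <name>hive.server2.thrift.rpc.protection</name>
  <!-- SASL QOP levels: auth | auth-int | auth-conf (auth-conf adds encryption) -->
  <value>auth-conf</value>
</property>
{noformat}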
[jira] [Resolved] (HIVE-4992) add ability to skip javadoc during build
[ https://issues.apache.org/jira/browse/HIVE-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-4992. Resolution: Fixed Fix Version/s: 0.12.0 Committed to trunk. Thanks, Sergey! add ability to skip javadoc during build Key: HIVE-4992 URL: https://issues.apache.org/jira/browse/HIVE-4992 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4992.D11967.1.patch, HIVE-4992.D11967.2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4987) Javadoc can generate argument list too long error
[ https://issues.apache.org/jira/browse/HIVE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4987: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Brock! Javadoc can generate argument list too long error - Key: HIVE-4987 URL: https://issues.apache.org/jira/browse/HIVE-4987 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4987.patch We just need to add useexternalfile=yes to the javadoc statements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4967) Don't serialize unnecessary fields in query plan
[ https://issues.apache.org/jira/browse/HIVE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4967: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Brock for review! Don't serialize unnecessary fields in query plan Key: HIVE-4967 URL: https://issues.apache.org/jira/browse/HIVE-4967 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.12.0 Attachments: HIVE-4967.1.patch, HIVE-4967.patch There are quite a few fields which need not to be serialized since they are initialized anyways in backend. We need not to serialize them in our plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5011) Dynamic partitioning in HCatalog broken on external tables
[ https://issues.apache.org/jira/browse/HIVE-5011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5011: --- Status: Patch Available (was: Open) Dynamic partitioning in HCatalog broken on external tables -- Key: HIVE-5011 URL: https://issues.apache.org/jira/browse/HIVE-5011 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Attachments: HIVE-5011.patch Dynamic partitioning with HCatalog has been broken as a result of HCATALOG-500 trying to support user-set paths for external tables. The goal there was to be able to support other custom destinations apart from the normal hive-style partitions. However, it is not currently possible for users to set paths for dynamic ptn writes, since we don't support any way for users to specify patterns(like, say $\{rootdir\}/$v1.$v2/) into which writes happen, only locations, and the values for dyn. partitions are not known ahead of time. Also, specifying a custom path messes with the way dynamic ptn. code tries to determine what was written to where from the output committer, which means that even if we supported patterned-writes instead of location-writes, we still have to do some more deep diving into the output committer code to support it. Thus, my current proposal is that we honour writes to user-specified paths for external tables *ONLY* for static partition writes - i.e., if we can determine that the write is a dyn. ptn. write, we will ignore the user specification. (Note that this does not mean we ignore the table's external location - we honour that - we just don't honour any HCatStorer/etc provided additional location - we stick to what metadata tells us the root location is. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5011) Dynamic partitioning in HCatalog broken on external tables
[ https://issues.apache.org/jira/browse/HIVE-5011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732144#comment-13732144 ] Sushanth Sowmyan commented on HIVE-5011: RB link : https://reviews.facebook.net/D12039 Dynamic partitioning in HCatalog broken on external tables -- Key: HIVE-5011 URL: https://issues.apache.org/jira/browse/HIVE-5011 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5011.patch Dynamic partitioning with HCatalog has been broken as a result of HCATALOG-500 trying to support user-set paths for external tables. The goal there was to be able to support other custom destinations apart from the normal hive-style partitions. However, it is not currently possible for users to set paths for dynamic ptn writes, since we don't support any way for users to specify patterns(like, say $\{rootdir\}/$v1.$v2/) into which writes happen, only locations, and the values for dyn. partitions are not known ahead of time. Also, specifying a custom path messes with the way dynamic ptn. code tries to determine what was written to where from the output committer, which means that even if we supported patterned-writes instead of location-writes, we still have to do some more deep diving into the output committer code to support it. Thus, my current proposal is that we honour writes to user-specified paths for external tables *ONLY* for static partition writes - i.e., if we can determine that the write is a dyn. ptn. write, we will ignore the user specification. (Note that this does not mean we ignore the table's external location - we honour that - we just don't honour any HCatStorer/etc provided additional location - we stick to what metadata tells us the root location is. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5011) Dynamic partitioning in HCatalog broken on external tables
[ https://issues.apache.org/jira/browse/HIVE-5011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5011: --- Priority: Critical (was: Major) Dynamic partitioning in HCatalog broken on external tables -- Key: HIVE-5011 URL: https://issues.apache.org/jira/browse/HIVE-5011 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Attachments: HIVE-5011.patch Dynamic partitioning with HCatalog has been broken as a result of HCATALOG-500 trying to support user-set paths for external tables. The goal there was to be able to support other custom destinations apart from the normal hive-style partitions. However, it is not currently possible for users to set paths for dynamic ptn writes, since we don't support any way for users to specify patterns(like, say $\{rootdir\}/$v1.$v2/) into which writes happen, only locations, and the values for dyn. partitions are not known ahead of time. Also, specifying a custom path messes with the way dynamic ptn. code tries to determine what was written to where from the output committer, which means that even if we supported patterned-writes instead of location-writes, we still have to do some more deep diving into the output committer code to support it. Thus, my current proposal is that we honour writes to user-specified paths for external tables *ONLY* for static partition writes - i.e., if we can determine that the write is a dyn. ptn. write, we will ignore the user specification. (Note that this does not mean we ignore the table's external location - we honour that - we just don't honour any HCatStorer/etc provided additional location - we stick to what metadata tells us the root location is. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4967) Don't serialize unnecessary fields in query plan
[ https://issues.apache.org/jira/browse/HIVE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732150#comment-13732150 ] Hudson commented on HIVE-4967: -- FAILURE: Integrated in Hive-trunk-hadoop2 #338 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/338/]) HIVE-4967 : Don't serialize unnecessary fields in query plan (Ashutosh Chauhan. Reviewed by Brock Noland) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1511377) * /hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/genericudf/example/GenericUDFDBOutput.java * /hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udtf/example/GenericUDTFExplode2.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeGenericFuncDesc.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFEWAHBitmapBop.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFReflect.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFContextNGrams.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCorrelation.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCovariance.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEWAHBitmap.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFHistogramNumeric.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLeadLag.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFNTile.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArray.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayContains.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseCompare.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCase.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCoalesce.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapEmpty.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFElt.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFField.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFormatNumber.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUtcTimestamp.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIn.java * 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInstr.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLocate.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMap.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMapKeys.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMapValues.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFNvl.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPAnd.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNot.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPOr.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPrintf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect2.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSize.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java *
[jira] [Commented] (HIVE-4992) add ability to skip javadoc during build
[ https://issues.apache.org/jira/browse/HIVE-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732149#comment-13732149 ] Hudson commented on HIVE-4992: -- FAILURE: Integrated in Hive-trunk-hadoop2 #338 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/338/]) HIVE-4992 : add ability to skip javadoc during build (Sergey Shelukhin via Ashutosh h Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1511374) * /hive/trunk/build.xml * /hive/trunk/hcatalog/build.xml add ability to skip javadoc during build Key: HIVE-4992 URL: https://issues.apache.org/jira/browse/HIVE-4992 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4992.D11967.1.patch, HIVE-4992.D11967.2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4987) Javadoc can generate argument list too long error
[ https://issues.apache.org/jira/browse/HIVE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732151#comment-13732151 ] Hudson commented on HIVE-4987: -- FAILURE: Integrated in Hive-trunk-hadoop2 #338 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/338/]) HIVE-4987 : Javadoc can generate argument list too long error (Brock Noland via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1511375) * /hive/trunk/build.xml * /hive/trunk/hcatalog/webhcat/svr/build.xml Javadoc can generate argument list too long error - Key: HIVE-4987 URL: https://issues.apache.org/jira/browse/HIVE-4987 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4987.patch We just need to add useexternalfile=yes to the javadoc statements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: DemuxOperator.java DDLTask.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java, ConditionalResolverSkewJoin.java, Context.java, CorrelationOptimizer.java, DDLSemanticAnalyzer.java, DDLTask.java, DemuxOperator.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java 
java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: Driver.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java, ConditionalResolverSkewJoin.java, Context.java, CorrelationOptimizer.java, DDLSemanticAnalyzer.java, DDLTask.java, DemuxOperator.java, Driver.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java 
java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java java/org/apache/hadoop/hive/ql/io/SequenceFileInputFormatChecker.java java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java java/org/apache/hadoop/hive/ql/io/orc/DynamicByteArray.java
[jira] [Commented] (HIVE-4233) The TGT gotten from class 'CLIService' should be renewed on time
[ https://issues.apache.org/jira/browse/HIVE-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732186#comment-13732186 ] Gunther Hagleitner commented on HIVE-4233: -- The test failure is unrelated. Tests look good for this patch. The TGT gotten from class 'CLIService' should be renewed on time - Key: HIVE-4233 URL: https://issues.apache.org/jira/browse/HIVE-4233 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Environment: CentOS release 6.3 (Final) jdk1.6.0_31 HiveServer2 0.10.0-cdh4.2.0 Kerberos Security Reporter: Dongyong Wang Assignee: Thejas M Nair Priority: Critical Attachments: 0001-FIX-HIVE-4233.patch, HIVE-4233-2.patch, HIVE-4233-3.patch, HIVE-4233.4.patch, HIVE-4233.5.patch When the HIveServer2 have started more than 7 days, I use beeline shell to connect the HiveServer2,all operation failed. The log of HiveServer2 shows it was caused by the Kerberos auth failure,the exception stack trace is: 2013-03-26 11:55:20,932 ERROR hive.ql.metadata.Hive: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1084) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:51) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151) at org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2275) at org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:358) at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:127) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1073) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1058) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082) ... 
16 more Caused by: java.lang.IllegalStateException: This ticket is no longer valid at javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:601) at java.lang.String.valueOf(String.java:2826) at java.lang.StringBuilder.append(StringBuilder.java:115) at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:120) at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:41) at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:130) at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:328) at java.security.AccessController.doPrivileged(Native Method) at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:325) at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:128) at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:106) at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:172) at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:209) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:195) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162) at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175) at
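A hedged sketch of the kind of periodic renewal the report asks for, built on the standard Hadoop UserGroupInformation relogin call; the scheduling interval and placement are illustrative assumptions, not the actual HIVE-4233 patch.
{code}
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.security.UserGroupInformation;

public class TgtRenewalSketch {
  // Illustrative assumption: renew hourly from a background thread so the
  // service principal's TGT never reaches its expiry.
  public static void startRenewer() {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleWithFixedDelay(new Runnable() {
      @Override
      public void run() {
        try {
          // Re-login from the keytab if the current TGT is close to expiring.
          UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
        } catch (IOException e) {
          e.printStackTrace(); // a real implementation would log and retry
        }
      }
    }, 1, 1, TimeUnit.HOURS);
  }
}
{code}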
[jira] [Updated] (HIVE-4586) [HCatalog] WebHCat should return 404 error for undefined resource
[ https://issues.apache.org/jira/browse/HIVE-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4586: - Attachment: HIVE-4586-2.patch HIVE-4586-2.patch resync with trunk and fix unit test failure. [HCatalog] WebHCat should return 404 error for undefined resource - Key: HIVE-4586 URL: https://issues.apache.org/jira/browse/HIVE-4586 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4586-1.patch, HIVE-4586-2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-1511: --- Attachment: HIVE-1511-wip2.patch Another checkpoint. Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Attachments: HIVE-1511.patch, HIVE-1511-wip2.patch, HIVE-1511-wip.patch As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM but I gave up after the test case did not go anywhere for about 2 minutes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4893) [WebHCat] HTTP 500 errors should be mapped to 400 for bad request
[ https://issues.apache.org/jira/browse/HIVE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-4893. -- Resolution: Duplicate [WebHCat] HTTP 500 errors should be mapped to 400 for bad request - Key: HIVE-4893 URL: https://issues.apache.org/jira/browse/HIVE-4893 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4893-1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: (was: HIVE-4531-6.patch) [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, samplestatusdirwithlist.tar.gz It would be nice we collect task logs after job finish. This is similar to what Amazon EMR does. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-6.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, samplestatusdirwithlist.tar.gz It would be nice we collect task logs after job finish. This is similar to what Amazon EMR does. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732233#comment-13732233 ] Benjamin Jakobus commented on HIVE-5018: Ah, ok. Really sorry - only saw this message now! Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java, ConditionalResolverSkewJoin.java, Context.java, CorrelationOptimizer.java, DDLSemanticAnalyzer.java, DDLTask.java, DemuxOperator.java, Driver.java, EmbeddedLockManager.java, ExecDriver.java, ExecReducer.java, EximUtil.java, ExplainTask.java, ExportSemanticAnalyzer.java, FileDump.java, FileSinkOperator.java, FunctionRegistry.java, GenMRFileSink1.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 
java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java java/org/apache/hadoop/hive/ql/io/RCFile.java
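For readers unfamiliar with the pattern this sub-task targets, the sketch below is a minimal, generic illustration of hoisting an allocation out of a loop. It is not code from any of the attached files; the class and method names are invented.
{code}
import java.util.List;

public class LoopAllocationExample {
  // Before: a new StringBuilder is allocated on every iteration.
  static String joinBefore(List<String> parts) {
    String result = "";
    for (String part : parts) {
      StringBuilder sb = new StringBuilder();   // allocated once per iteration
      sb.append(result).append(part).append(',');
      result = sb.toString();
    }
    return result;
  }

  // After: the builder is created once, outside the loop, and reused.
  static String joinAfter(List<String> parts) {
    StringBuilder sb = new StringBuilder();     // allocated once
    for (String part : parts) {
      sb.append(part).append(',');
    }
    return sb.toString();
  }
}
{code}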
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: GenMRFileSink1.java FunctionRegistry.java FileSinkOperator.java FileDump.java ExportSemanticAnalyzer.java ExplainTask.java EximUtil.java ExecReducer.java ExecDriver.java EmbeddedLockManager.java Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java, ConditionalResolverSkewJoin.java, Context.java, CorrelationOptimizer.java, DDLSemanticAnalyzer.java, DDLTask.java, DemuxOperator.java, Driver.java, EmbeddedLockManager.java, ExecDriver.java, ExecReducer.java, EximUtil.java, ExplainTask.java, ExportSemanticAnalyzer.java, FileDump.java, FileSinkOperator.java, FunctionRegistry.java, GenMRFileSink1.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
[jira] [Updated] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4324: -- Attachment: HIVE-4324.D12045.1.patch omalley requested code review of HIVE-4324 [jira] ORC Turn off dictionary encoding when number of distinct keys is greater than threshold. Reviewers: JIRA forward port of kevin's patch Add a configurable threshold so that if the number of distinct values in a string column is greater than that fraction of non-null values, dictionary encoding is turned off. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D12045 AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java conf/hive-default.xml.template ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OutStream.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringRedBlackTree.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestFileDump.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java ql/src/test/queries/clientpositive/orc_dictionary_threshold.q ql/src/test/resources/orc-file-dump-dictionary-threshold.out ql/src/test/results/clientpositive/orc_dictionary_threshold.q.out MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/28797/ To: JIRA, omalley ORC Turn off dictionary encoding when number of distinct keys is greater than threshold --- Key: HIVE-4324 URL: https://issues.apache.org/jira/browse/HIVE-4324 Project: Hive Issue Type: Sub-task Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4324.1.patch.txt, HIVE-4324.D12045.1.patch Add a configurable threshold so that if the number of distinct values in a string column is greater than that fraction of non-null values, dictionary encoding is turned off. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
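A minimal sketch of the threshold check the description calls for, assuming the writer tracks the dictionary size and the count of non-null values; the method and parameter names are illustrative, not the identifiers used in WriterImpl.java.
{code}
// Hypothetical check, run when deciding how to encode a string column:
// fall back to direct encoding once the ratio of distinct values to
// non-null values exceeds the configured threshold.
static boolean useDictionaryEncoding(int dictionarySize, int nonNullCount, double threshold) {
  if (nonNullCount == 0) {
    return true;   // nothing to compare against; keep the default
  }
  return ((double) dictionarySize / nonNullCount) <= threshold;
}
{code}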
[jira] [Commented] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732266#comment-13732266 ] Brock Noland commented on HIVE-5018: Hi, No reason to apologize! This is all part of becoming familiar with a project! We really appreciate the work you are putting in. Brock Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: AbstractGenericUDFEWAHBitmapBop.java, AbstractJoinTaskDispatcher.java, AbstractSMBJoinProc.java, BaseSemanticAnalyzer.java, BitmapIndexHandler.java, BlockMergeTask.java, BucketingSortingInferenceOptimizer.java, BucketingSortingOpProcFactory.java, BucketingSortingReduceSinkOptimizer.java, BucketizedHiveInputFormat.java, BucketMapJoinContext.java, ColumnPrunerProcFactory.java, ColumnStatsTask.java, CombineHiveInputFormat.java, CommonJoinOperator.java, CommonJoinTaskDispatcher.java, ConditionalResolverCommonJoin.java, ConditionalResolverSkewJoin.java, Context.java, CorrelationOptimizer.java, DDLSemanticAnalyzer.java, DDLTask.java, DemuxOperator.java, Driver.java, EmbeddedLockManager.java, ExecDriver.java, ExecReducer.java, EximUtil.java, ExplainTask.java, ExportSemanticAnalyzer.java, FileDump.java, FileSinkOperator.java, FunctionRegistry.java, GenMRFileSink1.java java/org/apache/hadoop/hive/ql/Context.java java/org/apache/hadoop/hive/ql/Driver.java java/org/apache/hadoop/hive/ql/QueryPlan.java java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java java/org/apache/hadoop/hive/ql/exec/DDLTask.java java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java java/org/apache/hadoop/hive/ql/exec/ExplainTask.java java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java java/org/apache/hadoop/hive/ql/exec/FetchOperator.java java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java java/org/apache/hadoop/hive/ql/exec/JoinUtil.java java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/MapOperator.java java/org/apache/hadoop/hive/ql/exec/MoveTask.java java/org/apache/hadoop/hive/ql/exec/MuxOperator.java java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java java/org/apache/hadoop/hive/ql/exec/PTFPersistence.java java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java java/org/apache/hadoop/hive/ql/exec/StatsTask.java java/org/apache/hadoop/hive/ql/exec/TaskFactory.java java/org/apache/hadoop/hive/ql/exec/UDFArgumentException.java java/org/apache/hadoop/hive/ql/exec/UnionOperator.java java/org/apache/hadoop/hive/ql/exec/Utilities.java java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java 
java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java java/org/apache/hadoop/hive/ql/history/HiveHistory.java java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java java/org/apache/hadoop/hive/ql/io/NonSyncDataInputBuffer.java
[jira] [Commented] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732274#comment-13732274 ] Eric Hanson commented on HIVE-4123: --- This is a great addition. Are you going to update the vectorized reader as well to read the updated format? The RLE encoding for ORC can be improved Key: HIVE-4123 URL: https://issues.apache.org/jira/browse/HIVE-4123 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Prasanth J Labels: orcfile Fix For: 0.12.0 Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt, HIVE-4123.6.txt, ORC-Compression-Ratio-Comparison.xlsx The run length encoding of integers can be improved: * tighter bit packing * allow delta encoding * allow longer runs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
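To make the delta-encoding bullet concrete, the toy sketch below groups an integer sequence into (base, delta, length) runs. It only illustrates the idea of delta runs; it does not reflect the actual encoding emitted by the patch.
{code}
import java.util.ArrayList;
import java.util.List;

public class DeltaRunExample {
  // Toy encoder: emits (base, delta, runLength) triples for maximal runs of
  // values whose consecutive differences are constant.
  static List<long[]> deltaRuns(long[] values) {
    List<long[]> runs = new ArrayList<long[]>();
    int i = 0;
    while (i < values.length) {
      int j = i;
      long delta = (i + 1 < values.length) ? values[i + 1] - values[i] : 0;
      while (j + 1 < values.length && values[j + 1] - values[j] == delta) {
        j++;
      }
      runs.add(new long[] { values[i], delta, j - i + 1 });
      i = j + 1;
    }
    return runs;
  }
}
{code}
For example, the sequence 1, 2, 3, 4, 5, 100 would come out as the runs (1, 1, 5) and (100, 0, 1).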
[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4545: Attachment: HIVE-4545.2.patch HIVE-4545.2.patch - patch rebased to latest trunk HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
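To illustrate the requested behavior change (not the actual patch), a formatter could apply space padding only for human-readable output; the helper and flag below are invented for illustration.
{code}
// Hypothetical helper: pad fields for human-readable CLI output, but return
// plain tab-separated fields when the result is served to JDBC/ODBC clients.
static String formatColumn(String name, String type, boolean padForHumans) {
  if (padForHumans) {
    return String.format("%-20s\t%-20s", name, type);   // space-padded columns
  }
  return name + "\t" + type;                             // no padding for HS2
}
{code}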
[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability
[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732293#comment-13732293 ] Ashutosh Chauhan commented on HIVE-4838: Actually, the memory monitoring I was talking about concerns the local task which generates the hashtable, which happens locally on the client. To generate the hashtable (which is then shipped to task nodes) we launch a local job on the client in a separate process. The memory-management logic for this local task is convoluted (not that of the MR job which actually does the join in the mapper). This local task monitors its own memory, but it seems MapredLocalTask is catching the OOM exception anyway, so one of the two is not required. My thinking is there shouldn't be any memory monitoring and we should just catch the OOM exception when it fails. In any case, a join is converted into a mapjoin only when the small table is small (governed by a config knob), so this OOM should be very rare. So my suggestion is to remove MemoryHandler altogether. The ORC memory manager won't be a problem here, since ORC makes use of the memory manager only while writing data, and here we are dumping the hashtable in Java-serialized format, so that won't be relevant. For a similar reason (that this is a local task), java.opts and io.sort.mb aren't relevant either. Refactor MapJoin HashMap code to improve testability and readability Key: HIVE-4838 URL: https://issues.apache.org/jira/browse/HIVE-4838 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch MapJoin is an essential component for high performance joins in Hive and the current code has done great service for many years. However, the code is showing its age and currently suffers from the following issues: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. * The api of a logical Table Container is not defined and therefore it's unclear what apis HashMapWrapper needs to publicize. Additionally HashMapWrapper has many used public methods. * HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. Ideally these logical units could be separated * HashTableSinkObjectCtx has unused fields and unused methods * CommonJoinOperator and children use ArrayList on left hand side when only List is required * There are unused classes MRU, DCLLItemm and classes which duplicate functionality MapJoinSingleKey and MapJoinDoubleKeys -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4545: Status: Patch Available (was: Open) HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability
[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732571#comment-13732571 ] Brock Noland commented on HIVE-4838: What I was saying is that the local task JVM could be of a different size than the mapred.child.java.opts on the server. I haven't heard of people hitting this much, so it must not be too much of an issue. Good to know the ORC stuff is only used on write, so it won't be an issue. I am fine with removing the memory handling and using OOM. I think I will allocate a buffer of, say, 1MB and then, when the OOM is hit, free that buffer so we can cleanly exit and log. Refactor MapJoin HashMap code to improve testability and readability Key: HIVE-4838 URL: https://issues.apache.org/jira/browse/HIVE-4838 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch MapJoin is an essential component for high performance joins in Hive and the current code has done great service for many years. However, the code is showing its age and currently suffers from the following issues: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. * The api of a logical Table Container is not defined and therefore it's unclear what apis HashMapWrapper needs to publicize. Additionally HashMapWrapper has many used public methods. * HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. Ideally these logical units could be separated * HashTableSinkObjectCtx has unused fields and unused methods * CommonJoinOperator and children use ArrayList on left hand side when only List is required * There are unused classes MRU, DCLLItemm and classes which duplicate functionality MapJoinSingleKey and MapJoinDoubleKeys -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732578#comment-13732578 ] Prasanth J commented on HIVE-4123: -- [~ehans] Sure. I can take a look at the changes required for the vectorized reader to read from these new encodings. The RLE encoding for ORC can be improved Key: HIVE-4123 URL: https://issues.apache.org/jira/browse/HIVE-4123 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Prasanth J Labels: orcfile Fix For: 0.12.0 Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt, HIVE-4123.6.txt, ORC-Compression-Ratio-Comparison.xlsx The run length encoding of integers can be improved: * tighter bit packing * allow delta encoding * allow longer runs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4545: Status: Open (was: Patch Available) HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
Thus far there hasn't been any dissent to managing our modules with maven. In addition there have been several comments positive on a move towards maven. I'd like to add Ivy seems to have issues managing multiple versions of libraries. For example in HIVE-3632 Ivy cache had to be cleared when testing patches that installed the new version of DataNucleus I have had the same issue on HIVE-4388. Requiring the deletion of the ivy cache is extremely painful for developers that don't have access to high bandwidth connections or live in areas far from California where most of these jars are hosted. I'd like to propose we move towards Maven. On Sat, Jul 27, 2013 at 1:19 PM, Mohammad Islam misla...@yahoo.com wrote: Yes hive build and test cases got convoluted as the project scope gradually increased. This is the time to take action! Based on my other Apache experiences, I prefer the option #3 Breakup the projects within our own source tree. Make multiple modules or sub-projects. By default, only key modules will be built. Maven could be a possible candidate. Regards, Mohammad From: Edward Capriolo edlinuxg...@gmail.com To: dev@hive.apache.org dev@hive.apache.org Sent: Saturday, July 27, 2013 7:03 AM Subject: Re: [Discuss] project chop up Or feel free to suggest different approach. I am used to managing software as multi-module maven projects. From a development standpoint if I was working on beeline, it would be nice to only require some of the sub-projects to be open in my IDE to do that. Also managing everything globally is not ideal. Hive's project layout, build, and test infrastructure is just funky. It has to do a few interesting things (shims, testing), but I do not think what we are doing justifies the massive ant build system we have. Ant is so ten years ago. On Sat, Jul 27, 2013 at 12:04 AM, Alan Gates ga...@hortonworks.com wrote: But I assume they'd still be a part of targets like package, tar, and binary? Making them compile and test separately and explicitly load the core Hive jars from maven/ivy seems reasonable. Alan. On Jul 26, 2013, at 8:40 PM, Brock Noland wrote: Hi, I think thats part of it but I'd like to decouple the downstream projects even further so that the only connection is the dependency on the hive jars. Brock On Jul 26, 2013 10:10 PM, Alan Gates ga...@hortonworks.com wrote: I'm not sure how this is different from what hcat does today. It needs Hive's jars to compile, so it's one of the last things in the compile step. Would moving the other modules you note to be in the same category be enough? Did you want to also make it so that the default ant target doesn't compile those? Alan. On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote: My mistake on saying hcat was a fork metastore. I had a brain fart for a moment. One way we could do this is create a folder called downstream. In our release step we can execute the downstream builds and then copy the files we need back. So nothing downstream will be on the classpath of the main project. This could help us breakup ql as well. Things like exotic file formats , and things that are pluggable like zk locking can go here. That might be overkill. For now we can focus on building downstream and hivethrift1might be the first thing to try to downstream. On Friday, July 26, 2013, Thejas Nair the...@hortonworks.com wrote: +1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore was supposed to merge up somehow. 
The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as set of libraries that let pig and java MR use hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in same project. On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also i believe hcatalog web can fall into the same designation. Question , hcatalog was initily a big hive-metastore fork. I was under the impression that Hcat and hive-metastore was supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in. On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well. On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing my laptop on a duel core 2 GB Ram laptop for
[jira] [Commented] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732595#comment-13732595 ] Phabricator commented on HIVE-4324: --- ashutoshc has requested changes to the revision HIVE-4324 [jira] ORC Turn off dictionary encoding when number of distinct keys is greater than threshold. Mostly looks good, except for some minor nits. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/io/orc/OutStream.java:249 Is it better to modify clear to accept compress and suppress arguments ? ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java:768 Good to add a javadoc saying this Reader reads strings which doesn't have accompanying dictionary. ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java:838 Similarly here, javadoc of effect : This reader reads dictionary encoded strings. ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringRedBlackTree.java:166 This method could be package private? REVISION DETAIL https://reviews.facebook.net/D12045 BRANCH h-4324 ARCANIST PROJECT hive To: JIRA, ashutoshc, omalley ORC Turn off dictionary encoding when number of distinct keys is greater than threshold --- Key: HIVE-4324 URL: https://issues.apache.org/jira/browse/HIVE-4324 Project: Hive Issue Type: Sub-task Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4324.1.patch.txt, HIVE-4324.D12045.1.patch Add a configurable threshold so that if the number of distinct values in a string column is greater than that fraction of non-null values, dictionary encoding is turned off. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability
[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732597#comment-13732597 ] Ashutosh Chauhan commented on HIVE-4838: bq. I am fine with removing the memory handling and using OOM. I think I will allocate a buffer of, say, 1MB and then, when the OOM is hit, free that buffer so we can cleanly exit and log. Sounds good. Let's proceed with that. Though I believe 256KB should be more than sufficient to generate the exception and exit cleanly. Refactor MapJoin HashMap code to improve testability and readability Key: HIVE-4838 URL: https://issues.apache.org/jira/browse/HIVE-4838 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch MapJoin is an essential component for high performance joins in Hive and the current code has done great service for many years. However, the code is showing its age and currently suffers from the following issues: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. * The api of a logical Table Container is not defined and therefore it's unclear what apis HashMapWrapper needs to publicize. Additionally HashMapWrapper has many used public methods. * HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. Ideally these logical units could be separated * HashTableSinkObjectCtx has unused fields and unused methods * CommonJoinOperator and children use ArrayList on left hand side when only List is required * There are unused classes MRU, DCLLItemm and classes which duplicate functionality MapJoinSingleKey and MapJoinDoubleKeys -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
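A rough sketch of the reserve-buffer idea agreed on above, assuming the local task simply catches OutOfMemoryError; the field name, buffer size, and the rethrow are placeholders, not the eventual patch.
{code}
// Fragment of the local task, assuming the plan discussed above:
// pre-allocate a small reserve so that when the hashtable load hits OOM,
// releasing the reserve leaves enough head-room to log and exit cleanly.
private byte[] reserve = new byte[1024 * 1024];   // ~1MB; 256KB may also suffice

void loadHashTable() {
  try {
    // ... build and serialize the map-join hashtable here ...
  } catch (OutOfMemoryError oom) {
    reserve = null;   // free the reserve so the logging path can allocate
    // log the failure and abort the local task with a non-zero status
    throw new RuntimeException("MapJoin local task ran out of memory", oom);
  }
}
{code}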
[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4545: Attachment: HIVE-4545.3.patch HIVE-4545.3.patch - updates test case to remove .trim() before comparison in two more places. HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request 13383: HIVE-4545 - HS2 should return describe table results without space padding
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13383/ --- Review request for hive. Repository: hive-git Description --- HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 83f337b jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java f35a351 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 4dcb260 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java a85a19d ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java 0d71891 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java 4c40034 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 0f48674 service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 7254491 Diff: https://reviews.apache.org/r/13383/diff/ Testing --- Updated TestJdbcDriver2 unit tests Thanks, Thejas Nair
Re: Review Request 13383: HIVE-4545 - HS2 should return describe table results without space padding
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13383/ --- (Updated Aug. 7, 2013, 7:18 p.m.) Review request for hive. Changes --- HIVE-4545.3.patch - updates test case to remove .trim() before comparison in two more places. Bugs: HIVE-4545 https://issues.apache.org/jira/browse/HIVE-4545 Repository: hive-git Description --- HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 83f337b jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java f35a351 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 4dcb260 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java a85a19d ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java 0d71891 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java 4c40034 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 0f48674 service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 7254491 Diff: https://reviews.apache.org/r/13383/diff/ Testing --- Updated TestJdbcDriver2 unit tests Thanks, Thejas Nair
[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4545: Status: Patch Available (was: Open) Review board link - https://reviews.apache.org/r/13383/ HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2482) Convenience UDFs for binary data type
[ https://issues.apache.org/jira/browse/HIVE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732616#comment-13732616 ] Ashutosh Chauhan commented on HIVE-2482: Hey [~mwagner] I have a couple of minor comments. Can you create a RB or phabricator entry for the patch? Convenience UDFs for binary data type - Key: HIVE-2482 URL: https://issues.apache.org/jira/browse/HIVE-2482 Project: Hive Issue Type: New Feature Affects Versions: 0.9.0 Reporter: Ashutosh Chauhan Assignee: Mark Wagner Attachments: HIVE-2482.1.patch, HIVE-2482.2.patch HIVE-2380 introduced the binary data type in Hive. It will be good to have the following UDFs to make it more useful: * UDFs to convert to/from hex string * UDFs to convert to/from string using a specific encoding * UDFs to convert to/from base64 string * UDFs to convert to/from non-string types using a particular serde -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
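For context on the conversions being requested, here is a plain-Java sketch of hex and base64 round-trips using commons-codec; the class is invented for illustration, it is not the UDF code under review, and the availability of commons-codec on the classpath is an assumption.
{code}
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.codec.binary.Hex;

public class BinaryConversionSketch {
  // binary -> hex string and back
  static String toHex(byte[] data) { return Hex.encodeHexString(data); }
  static byte[] fromHex(String hex) throws DecoderException { return Hex.decodeHex(hex.toCharArray()); }

  // binary -> base64 string and back
  static String toBase64(byte[] data) { return Base64.encodeBase64String(data); }
  static byte[] fromBase64(String base64) { return Base64.decodeBase64(base64); }
}
{code}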
[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport
[ https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732623#comment-13732623 ] Ashutosh Chauhan commented on HIVE-4911: [~amalakar] HIVE-4911-trunk-3.patch is the patch in its entirety. We don't need anything else, right? Enable QOP configuration for Hive Server 2 thrift transport --- Key: HIVE-4911 URL: https://issues.apache.org/jira/browse/HIVE-4911 Project: Hive Issue Type: New Feature Reporter: Arup Malakar Assignee: Arup Malakar Attachments: 20-build-temp-change-1.patch, 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch The QoP for hive server 2 should be configurable to enable encryption. A new configuration, hive.server2.thrift.rpc.protection, should be exposed. This would give greater control when configuring the hive server 2 service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
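For readers unfamiliar with SASL QOP, the sketch below shows how a configuration value like the proposed hive.server2.thrift.rpc.protection commonly maps onto the javax.security.sasl QOP property; the value names and the mapping are an assumption about the general mechanism, not the contents of the attached patches.
{code}
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

public class QopMappingSketch {
  // Hypothetical mapping from a hive.server2.thrift.rpc.protection value
  // (following the hadoop.rpc.protection naming) to the SASL QOP property.
  static Map<String, String> saslProps(String rpcProtection) {
    String qop;
    if ("authentication".equalsIgnoreCase(rpcProtection)) {
      qop = "auth";        // authentication only
    } else if ("integrity".equalsIgnoreCase(rpcProtection)) {
      qop = "auth-int";    // authentication plus integrity checking
    } else if ("privacy".equalsIgnoreCase(rpcProtection)) {
      qop = "auth-conf";   // authentication, integrity and encryption
    } else {
      throw new IllegalArgumentException("Unknown protection level: " + rpcProtection);
    }
    Map<String, String> props = new HashMap<String, String>();
    props.put(Sasl.QOP, qop);   // javax.security.sasl.qop
    return props;
  }
}
{code}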
[jira] [Commented] (HIVE-4789) FetchOperator fails on partitioned Avro data
[ https://issues.apache.org/jira/browse/HIVE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732634#comment-13732634 ] Ashutosh Chauhan commented on HIVE-4789: Your changes in MetaStoreUtils are indeed reasonable. I just wanted to make sure whether they are really needed. If you can come up with a testcase, which shows the failure without changes in MetaStoreUtils, that will make it easier to concretize why these changes are useful. FetchOperator fails on partitioned Avro data Key: HIVE-4789 URL: https://issues.apache.org/jira/browse/HIVE-4789 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Attachments: HIVE-4789.1.patch.txt, HIVE-4789.2.patch.txt HIVE-3953 fixed using partitioned avro tables for anything that used the MapOperator, but those that rely on FetchOperator still fail with the same error. e.g. {code} SELECT * FROM partitioned_avro LIMIT 5; SELECT * FROM partitioned_avro WHERE partition_col=value; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
I'd like to propose we move towards Maven. Big +1 on this. Most of the major apache projects(hadoop, hbase, avro etc.) are maven based. Also can't agree more that the current build system is frustrating to say the least. Another issue I had with the existing ant based system is that there are no checkpointing capabilities[1]. So if a 6 hour build fails after 5hr 30 minutes, most of the things even though successful have to be rebuilt which is very time consuming. Maven reactors have inbuilt support for lot of this stuff. [1] https://issues.apache.org/jira/browse/HIVE-3449. On Wed, Aug 7, 2013 at 2:06 PM, Brock Noland br...@cloudera.com wrote: Thus far there hasn't been any dissent to managing our modules with maven. In addition there have been several comments positive on a move towards maven. I'd like to add Ivy seems to have issues managing multiple versions of libraries. For example in HIVE-3632 Ivy cache had to be cleared when testing patches that installed the new version of DataNucleus I have had the same issue on HIVE-4388. Requiring the deletion of the ivy cache is extremely painful for developers that don't have access to high bandwidth connections or live in areas far from California where most of these jars are hosted. I'd like to propose we move towards Maven. On Sat, Jul 27, 2013 at 1:19 PM, Mohammad Islam misla...@yahoo.com wrote: Yes hive build and test cases got convoluted as the project scope gradually increased. This is the time to take action! Based on my other Apache experiences, I prefer the option #3 Breakup the projects within our own source tree. Make multiple modules or sub-projects. By default, only key modules will be built. Maven could be a possible candidate. Regards, Mohammad From: Edward Capriolo edlinuxg...@gmail.com To: dev@hive.apache.org dev@hive.apache.org Sent: Saturday, July 27, 2013 7:03 AM Subject: Re: [Discuss] project chop up Or feel free to suggest different approach. I am used to managing software as multi-module maven projects. From a development standpoint if I was working on beeline, it would be nice to only require some of the sub-projects to be open in my IDE to do that. Also managing everything globally is not ideal. Hive's project layout, build, and test infrastructure is just funky. It has to do a few interesting things (shims, testing), but I do not think what we are doing justifies the massive ant build system we have. Ant is so ten years ago. On Sat, Jul 27, 2013 at 12:04 AM, Alan Gates ga...@hortonworks.com wrote: But I assume they'd still be a part of targets like package, tar, and binary? Making them compile and test separately and explicitly load the core Hive jars from maven/ivy seems reasonable. Alan. On Jul 26, 2013, at 8:40 PM, Brock Noland wrote: Hi, I think thats part of it but I'd like to decouple the downstream projects even further so that the only connection is the dependency on the hive jars. Brock On Jul 26, 2013 10:10 PM, Alan Gates ga...@hortonworks.com wrote: I'm not sure how this is different from what hcat does today. It needs Hive's jars to compile, so it's one of the last things in the compile step. Would moving the other modules you note to be in the same category be enough? Did you want to also make it so that the default ant target doesn't compile those? Alan. On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote: My mistake on saying hcat was a fork metastore. I had a brain fart for a moment. One way we could do this is create a folder called downstream. 
In our release step we can execute the downstream builds and then copy the files we need back. So nothing downstream will be on the classpath of the main project. This could help us breakup ql as well. Things like exotic file formats , and things that are pluggable like zk locking can go here. That might be overkill. For now we can focus on building downstream and hivethrift1might be the first thing to try to downstream. On Friday, July 26, 2013, Thejas Nair the...@hortonworks.com wrote: +1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore was supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as set of libraries that let pig and java MR use hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in same project.
[jira] [Commented] (HIVE-4990) ORC seeks fails with non-zero offset or column projection
[ https://issues.apache.org/jira/browse/HIVE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732655#comment-13732655 ] Phabricator commented on HIVE-4990: --- ashutoshc has accepted the revision HIVE-4990 [jira] ORC seeks fails with non-zero offset or column projection. +1 REVISION DETAIL https://reviews.facebook.net/D12009 BRANCH trunk ARCANIST PROJECT hive To: JIRA, ashutoshc, omalley ORC seeks fails with non-zero offset or column projection - Key: HIVE-4990 URL: https://issues.apache.org/jira/browse/HIVE-4990 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.11.1 Attachments: HIVE-4990.D12009.1.patch The ORC reader gets exceptions when seeking with non-zero offsets or column projection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
I think that is a good idea. I have been thinking about it a lot. I especially hate how the offline build is now broken. However I think it is going to take some time. There are some tricks like how we build hive-exec jar that are not very clean to do in maven. I am very interested The last initiative we spoke about on list was moving from forest, I would like to finish/start that before we get onto the project chop up. On Wed, Aug 7, 2013 at 3:06 PM, Brock Noland br...@cloudera.com wrote: Thus far there hasn't been any dissent to managing our modules with maven. In addition there have been several comments positive on a move towards maven. I'd like to add Ivy seems to have issues managing multiple versions of libraries. For example in HIVE-3632 Ivy cache had to be cleared when testing patches that installed the new version of DataNucleus I have had the same issue on HIVE-4388. Requiring the deletion of the ivy cache is extremely painful for developers that don't have access to high bandwidth connections or live in areas far from California where most of these jars are hosted. I'd like to propose we move towards Maven. On Sat, Jul 27, 2013 at 1:19 PM, Mohammad Islam misla...@yahoo.com wrote: Yes hive build and test cases got convoluted as the project scope gradually increased. This is the time to take action! Based on my other Apache experiences, I prefer the option #3 Breakup the projects within our own source tree. Make multiple modules or sub-projects. By default, only key modules will be built. Maven could be a possible candidate. Regards, Mohammad From: Edward Capriolo edlinuxg...@gmail.com To: dev@hive.apache.org dev@hive.apache.org Sent: Saturday, July 27, 2013 7:03 AM Subject: Re: [Discuss] project chop up Or feel free to suggest different approach. I am used to managing software as multi-module maven projects. From a development standpoint if I was working on beeline, it would be nice to only require some of the sub-projects to be open in my IDE to do that. Also managing everything globally is not ideal. Hive's project layout, build, and test infrastructure is just funky. It has to do a few interesting things (shims, testing), but I do not think what we are doing justifies the massive ant build system we have. Ant is so ten years ago. On Sat, Jul 27, 2013 at 12:04 AM, Alan Gates ga...@hortonworks.com wrote: But I assume they'd still be a part of targets like package, tar, and binary? Making them compile and test separately and explicitly load the core Hive jars from maven/ivy seems reasonable. Alan. On Jul 26, 2013, at 8:40 PM, Brock Noland wrote: Hi, I think thats part of it but I'd like to decouple the downstream projects even further so that the only connection is the dependency on the hive jars. Brock On Jul 26, 2013 10:10 PM, Alan Gates ga...@hortonworks.com wrote: I'm not sure how this is different from what hcat does today. It needs Hive's jars to compile, so it's one of the last things in the compile step. Would moving the other modules you note to be in the same category be enough? Did you want to also make it so that the default ant target doesn't compile those? Alan. On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote: My mistake on saying hcat was a fork metastore. I had a brain fart for a moment. One way we could do this is create a folder called downstream. In our release step we can execute the downstream builds and then copy the files we need back. So nothing downstream will be on the classpath of the main project. This could help us breakup ql as well. 
Things like exotic file formats , and things that are pluggable like zk locking can go here. That might be overkill. For now we can focus on building downstream and hivethrift1might be the first thing to try to downstream. On Friday, July 26, 2013, Thejas Nair the...@hortonworks.com wrote: +1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore was supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as set of libraries that let pig and java MR use hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in same project. On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also i believe hcatalog web can fall into the same
[jira] [Commented] (HIVE-3619) Hive JDBC driver should return a proper update-count of rows affected by query
[ https://issues.apache.org/jira/browse/HIVE-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732658#comment-13732658 ] Konstantin Boudnik commented on HIVE-3619: -- At least returning {{-1}} in the interim would be good, no? Hive JDBC driver should return a proper update-count of rows affected by query -- Key: HIVE-3619 URL: https://issues.apache.org/jira/browse/HIVE-3619 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Harsh J Priority: Minor HiveStatement.java currently has an explicit 0 return: public int getUpdateCount() throws SQLException { return 0; } Ideally we ought to emit the exact number of rows affected by the query statement itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
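The interim change suggested above would look roughly like the following in HiveStatement; this is a sketch of the suggestion, not a committed patch. Returning -1 is the conventional JDBC way to signal that no update count is available.
{code}
// HiveStatement: report "no update count available" instead of claiming
// that zero rows were affected, until a real count can be returned.
public int getUpdateCount() throws SQLException {
  return -1;
}
{code}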
Re: [Discuss] project chop up
FYI I am still waiting on Infra for the CMS move: https://issues.apache.org/jira/browse/INFRA-6593 On Wed, Aug 7, 2013 at 2:57 PM, Edward Capriolo edlinuxg...@gmail.comwrote: I think that is a good idea. I have been thinking about it a lot. I especially hate how the offline build is now broken. However I think it is going to take some time. There are some tricks like how we build hive-exec jar that are not very clean to do in maven. I am very interested The last initiative we spoke about on list was moving from forest, I would like to finish/start that before we get onto the project chop up. On Wed, Aug 7, 2013 at 3:06 PM, Brock Noland br...@cloudera.com wrote: Thus far there hasn't been any dissent to managing our modules with maven. In addition there have been several comments positive on a move towards maven. I'd like to add Ivy seems to have issues managing multiple versions of libraries. For example in HIVE-3632 Ivy cache had to be cleared when testing patches that installed the new version of DataNucleus I have had the same issue on HIVE-4388. Requiring the deletion of the ivy cache is extremely painful for developers that don't have access to high bandwidth connections or live in areas far from California where most of these jars are hosted. I'd like to propose we move towards Maven. On Sat, Jul 27, 2013 at 1:19 PM, Mohammad Islam misla...@yahoo.com wrote: Yes hive build and test cases got convoluted as the project scope gradually increased. This is the time to take action! Based on my other Apache experiences, I prefer the option #3 Breakup the projects within our own source tree. Make multiple modules or sub-projects. By default, only key modules will be built. Maven could be a possible candidate. Regards, Mohammad From: Edward Capriolo edlinuxg...@gmail.com To: dev@hive.apache.org dev@hive.apache.org Sent: Saturday, July 27, 2013 7:03 AM Subject: Re: [Discuss] project chop up Or feel free to suggest different approach. I am used to managing software as multi-module maven projects. From a development standpoint if I was working on beeline, it would be nice to only require some of the sub-projects to be open in my IDE to do that. Also managing everything globally is not ideal. Hive's project layout, build, and test infrastructure is just funky. It has to do a few interesting things (shims, testing), but I do not think what we are doing justifies the massive ant build system we have. Ant is so ten years ago. On Sat, Jul 27, 2013 at 12:04 AM, Alan Gates ga...@hortonworks.com wrote: But I assume they'd still be a part of targets like package, tar, and binary? Making them compile and test separately and explicitly load the core Hive jars from maven/ivy seems reasonable. Alan. On Jul 26, 2013, at 8:40 PM, Brock Noland wrote: Hi, I think thats part of it but I'd like to decouple the downstream projects even further so that the only connection is the dependency on the hive jars. Brock On Jul 26, 2013 10:10 PM, Alan Gates ga...@hortonworks.com wrote: I'm not sure how this is different from what hcat does today. It needs Hive's jars to compile, so it's one of the last things in the compile step. Would moving the other modules you note to be in the same category be enough? Did you want to also make it so that the default ant target doesn't compile those? Alan. On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote: My mistake on saying hcat was a fork metastore. I had a brain fart for a moment. One way we could do this is create a folder called downstream. 
In our release step we can execute the downstream builds and then copy the files we need back. So nothing downstream will be on the classpath of the main project. This could help us breakup ql as well. Things like exotic file formats , and things that are pluggable like zk locking can go here. That might be overkill. For now we can focus on building downstream and hivethrift1might be the first thing to try to downstream. On Friday, July 26, 2013, Thejas Nair the...@hortonworks.com wrote: +1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore was supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat
[jira] [Commented] (HIVE-4964) Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced
[ https://issues.apache.org/jira/browse/HIVE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732671#comment-13732671 ] Ashutosh Chauhan commented on HIVE-4964: [~rhbutani] Are we removing some functionality in this patch, or is it just dead code removal? If we are removing some functionality, can you outline what you are proposing to drop? Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced --- Key: HIVE-4964 URL: https://issues.apache.org/jira/browse/HIVE-4964 Project: Hive Issue Type: Bug Reporter: Harish Butani Priority: Minor Attachments: HIVE-4964.D11985.1.patch, HIVE-4964.D11985.2.patch There are still pieces of code that deal with: - supporting select expressions with Windowing - supporting a filter with windowing Need to do this before introducing Perf. improvements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport
[ https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732709#comment-13732709 ] Arup Malakar commented on HIVE-4911: [~ashutoshc] That is correct. The 20-build* patches are temporary patches I used to build against 0.20 until HIVE-4991 is committed. Enable QOP configuration for Hive Server 2 thrift transport --- Key: HIVE-4911 URL: https://issues.apache.org/jira/browse/HIVE-4911 Project: Hive Issue Type: New Feature Reporter: Arup Malakar Assignee: Arup Malakar Attachments: 20-build-temp-change-1.patch, 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch The QoP for Hive Server 2 should be configurable to enable encryption. A new configuration, hive.server2.thrift.rpc.protection, should be exposed. This would give greater control when configuring the Hive Server 2 service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
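For illustration only, here is a minimal Java sketch of how such a protection setting could be mapped onto the standard javax.security.sasl QOP values. The accepted option names ("authentication", "integrity", "privacy") and the class and method names are assumptions modeled on Hadoop's hadoop.rpc.protection convention, not the contents of the attached patches.
{noformat}
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

// Hypothetical helper, not HiveServer2 code: translate a protection-level
// string into SASL properties for a thrift transport.
public class SaslQopMapper {
  public static Map<String, String> saslPropsFor(String protection) {
    String qop;
    if ("authentication".equals(protection)) {
      qop = "auth";        // authentication only
    } else if ("integrity".equals(protection)) {
      qop = "auth-int";    // adds message integrity checks
    } else if ("privacy".equals(protection)) {
      qop = "auth-conf";   // adds encryption of the RPC payload
    } else {
      throw new IllegalArgumentException("Unknown protection level: " + protection);
    }
    Map<String, String> props = new HashMap<String, String>();
    props.put(Sasl.QOP, qop);            // javax.security.sasl.qop
    props.put(Sasl.SERVER_AUTH, "true"); // require mutual authentication
    return props;
  }
}
{noformat}
A caller would read the configured value (e.g. from hive-site.xml) and pass the resulting property map to whatever SASL transport factory the server uses; the key point is that "privacy" is what actually turns on wire encryption.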
[jira] [Created] (HIVE-5020) HCat reading null-key map entries causes NPE
Sushanth Sowmyan created HIVE-5020: -- Summary: HCat reading null-key map entries causes NPE Key: HIVE-5020 URL: https://issues.apache.org/jira/browse/HIVE-5020 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Currently, if someone has a null key in a map, HCatInputFormat will terminate with an NPE while trying to read it. {noformat} java.lang.NullPointerException at java.lang.String.compareTo(String.java:1167) at java.lang.String.compareTo(String.java:92) at java.util.TreeMap.put(TreeMap.java:545) at org.apache.hcatalog.data.HCatRecordSerDe.serializeMap(HCatRecordSerDe.java:222) at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:198) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203) {noformat} This is because we use a TreeMap to preserve order of elements in the map when reading from the underlying storage/serde. This problem is easily fixed in a number of ways: a) Switch to HashMap, which allows null keys. That does not preserve order of keys, which should not be important for map fields, but if we desire that, we have a solution for that too - LinkedHashMap, which would both retain order and allow us to insert null keys into the map. b) Ignore null keyed entries - check if the field we read is null, and if it is, then ignore that item in the record altogether. This way, HCat is robust in what it does - it does not terminate with an NPE, and it does not allow null keys in maps that might be problematic to layers above us that are not used to seeing nulls as keys in maps. Why do I bring up the second fix? I bring it up because of the way we discovered this bug. When reading from an RCFile, we do not notice this bug. If the same query that produced the RCFile instead produces an Orcfile, and we try reading from it, we see this problem. RCFile seems to be quietly stripping any null key entries, whereas Orc retains them. This is why we didn't notice this problem for a long while, and suddenly, now, we are. Now, if we fix our code to allow nulls in map keys through to layers above, we expose layers above to this change, which may then cause them to break. (Technically, this is stretching the case because we already break now if they care) More importantly, though, we have a case now, where the same data will be exposed differently if it were stored as orc or if it were stored as rcfile. And as a layer that is supposed to make storage invisible to the end user, HCat should attempt to provide some consistency in how data behaves to the end user. That said... There is another important concern at hand here: nulls in map keys might be due to bad data(corruption or loading error), and by stripping them, we might be silently hiding that from the user. This is an important point that does steer me towards the former approach, of passing it on to layers above, and standardize on an understanding that null keys in maps are acceptable data that layers above us have to handle. After that, it could be taken on as a further consistency fix, to fix RCFile so that it allows nulls in map keys. Having gone through this discussion of standardization, another important question is whether or not there is actually a use-case for null keys in maps in data. 
If there isn't, maybe we shouldn't allow writing that in the first place, and both orc and rcfile must simply error out to the end user if they try to write a null map key? Well, it is true that it is possible that data errors lead to null keys, but it's also possible that the user wants to store a mapping for value transformations, and they might have a transformation for null as well. In the case I encountered it, they were writing out an intermediate table after having read from a sparse table using a custom input format that generated an arbitrary number of columns, and were using the map to store column name mappings that would eventually be written out to another table. That seems a valid use, and we shouldn't prevent users from this sort of usage. Another reason for not allowing null keys from a java perspective is locking and concurrency concerns, where locking on a null is a pain, per philosophical disagreements between Joshua Block and Doug Lea in the design of HashMap and ConcurrentHashMap. However, given that HCatalog reads are happening in a thread on a drone where there should be no parallel access of that record, and more importantly, this should strictly be used in a read-only kind of usage, we should not have to worry about that. Increasingly, my preference is to change to LinkedHashMaps to allow
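As a standalone Java illustration of the failure mode and of option (a) above (this is a demonstration against the plain JDK, not HCatRecordSerDe code):
{noformat}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class NullKeyMapDemo {
  public static void main(String[] args) {
    // TreeMap orders entries by comparing keys, so a null key ends up in
    // String.compareTo and throws, as in the stack trace above.
    Map<String, String> sorted = new TreeMap<String, String>();
    sorted.put("colA", "v1");
    try {
      sorted.put(null, "v2");
    } catch (NullPointerException e) {
      System.out.println("TreeMap rejects null keys: " + e);
    }

    // Option (a): LinkedHashMap keeps insertion order and accepts a null key.
    Map<String, String> ordered = new LinkedHashMap<String, String>();
    ordered.put(null, "v2");
    ordered.put("colA", "renamedA");
    System.out.println(ordered); // prints {null=v2, colA=renamedA}
  }
}
{noformat}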
Re: [Discuss] project chop up
On Wed, Aug 7, 2013 at 12:55 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: I'd like to propose we move towards Maven. Big +1 on this. Most of the major Apache projects (Hadoop, HBase, Avro, etc.) are Maven based. A big +1 from me too. I actually took a pass at it a couple of months ago. Part of the difficulty was that some of the test classes are in the wrong module and reference classes in a later module. Obviously that prevents any kind of modular build. An additional plus for Maven is that it includes tools to correct the project and module dependencies. -- Owen
[jira] [Updated] (HIVE-5020) HCat reading null-key map entries causes NPE
[ https://issues.apache.org/jira/browse/HIVE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5020: --- Description: Currently, if someone has a null key in a map, HCatInputFormat will terminate with an NPE while trying to read it. {noformat} java.lang.NullPointerException at java.lang.String.compareTo(String.java:1167) at java.lang.String.compareTo(String.java:92) at java.util.TreeMap.put(TreeMap.java:545) at org.apache.hcatalog.data.HCatRecordSerDe.serializeMap(HCatRecordSerDe.java:222) at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:198) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203) {noformat} This is because we use a TreeMap to preserve order of elements in the map when reading from the underlying storage/serde. This problem is easily fixed in a number of ways: a) Switch to HashMap, which allows null keys. That does not preserve order of keys, which should not be important for map fields, but if we desire that, we have a solution for that too - LinkedHashMap, which would both retain order and allow us to insert null keys into the map. b) Ignore null keyed entries - check if the field we read is null, and if it is, then ignore that item in the record altogether. This way, HCat is robust in what it does - it does not terminate with an NPE, and it does not allow null keys in maps that might be problematic to layers above us that are not used to seeing nulls as keys in maps. Why do I bring up the second fix? I bring it up because of the way we discovered this bug. When reading from an RCFile, we do not notice this bug. If the same query that produced the RCFile instead produces an Orcfile, and we try reading from it, we see this problem. RCFile seems to be quietly stripping any null key entries, whereas Orc retains them. This is why we didn't notice this problem for a long while, and suddenly, now, we are. Now, if we fix our code to allow nulls in map keys through to layers above, we expose layers above to this change, which may then cause them to break. (Technically, this is stretching the case because we already break now if they care) More importantly, though, we have a case now, where the same data will be exposed differently if it were stored as orc or if it were stored as rcfile. And as a layer that is supposed to make storage invisible to the end user, HCat should attempt to provide some consistency in how data behaves to the end user. That said... There is another important concern at hand here: nulls in map keys might be due to bad data(corruption or loading error), and by stripping them, we might be silently hiding that from the user. This is an important point that does steer me towards the former approach, of passing it on to layers above, and standardize on an understanding that null keys in maps are acceptable data that layers above us have to handle. After that, it could be taken on as a further consistency fix, to fix RCFile so that it allows nulls in map keys. Having gone through this discussion of standardization, another important question is whether or not there is actually a use-case for null keys in maps in data. If there isn't, maybe we shouldn't allow writing that in the first place, and both orc and rcfile must simply error out to the end user if they try to write a null map key? 
Well, it is true that it is possible that data errors lead to null keys, but it's also possible that the user wants to store a mapping for value transformations, and they might have a transformation for null as well. In the case I encountered it, they were writing out an intermediate table after having read from a sparse table using a custom input format that generated an arbitrary number of columns, and were using the map to store column name mappings that would eventually be written out to another table. That seems a valid use, and we shouldn't prevent users from this sort of usage. Another reason for not allowing null keys from a java perspective is locking and concurrency concerns, where locking on a null is a pain, per philosophical disagreements between Joshua Bloch and Doug Lea in the design of HashMap and ConcurrentHashMap. However, given that HCatalog reads are happening in a thread on a drone where there should be no parallel access of that record, and more importantly, this should strictly be used in a read-only kind of usage, we should not have to worry about that. Increasingly, my preference is to change to LinkedHashMaps to allow null keys, and for consistency's sake, after this is tackled, to see if we should be fixing RCFile to allow null keys(this might be trickier since RCFile has a lot of other users that are
[jira] [Commented] (HIVE-5011) Dynamic partitioning in HCatalog broken on external tables
[ https://issues.apache.org/jira/browse/HIVE-5011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732718#comment-13732718 ] Daniel Dai commented on HIVE-5011: -- Looks good. For dynamic partitions, we shall disable the customized external partition location. We can support path patterns in the future, but that's more complex to do. +1 Dynamic partitioning in HCatalog broken on external tables -- Key: HIVE-5011 URL: https://issues.apache.org/jira/browse/HIVE-5011 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Attachments: HIVE-5011.patch Dynamic partitioning with HCatalog has been broken as a result of HCATALOG-500 trying to support user-set paths for external tables. The goal there was to be able to support other custom destinations apart from the normal hive-style partitions. However, it is not currently possible for users to set paths for dynamic ptn writes, since we don't support any way for users to specify patterns (like, say $\{rootdir\}/$v1.$v2/) into which writes happen, only locations, and the values for dyn. partitions are not known ahead of time. Also, specifying a custom path messes with the way the dynamic ptn. code tries to determine what was written to where from the output committer, which means that even if we supported patterned-writes instead of location-writes, we would still have to do some more deep diving into the output committer code to support it. Thus, my current proposal is that we honour writes to user-specified paths for external tables *ONLY* for static partition writes - i.e., if we can determine that the write is a dyn. ptn. write, we will ignore the user specification. (Note that this does not mean we ignore the table's external location - we honour that - we just don't honour any HCatStorer/etc provided additional location - we stick to what metadata tells us the root location is.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5020) HCat reading null-key map entries causes NPE
[ https://issues.apache.org/jira/browse/HIVE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732719#comment-13732719 ] Edward Capriolo commented on HIVE-5020: --- If I had to hazard a guess I would say that the original implementation was about supporting thrift structures. Possibly if thrift does not support this case that design was not carried over. Personally I think we SHOULD support NULL key and NULL value in maps. The map need not be sorted. HCat reading null-key map entries causes NPE Key: HIVE-5020 URL: https://issues.apache.org/jira/browse/HIVE-5020 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Currently, if someone has a null key in a map, HCatInputFormat will terminate with an NPE while trying to read it. {noformat} java.lang.NullPointerException at java.lang.String.compareTo(String.java:1167) at java.lang.String.compareTo(String.java:92) at java.util.TreeMap.put(TreeMap.java:545) at org.apache.hcatalog.data.HCatRecordSerDe.serializeMap(HCatRecordSerDe.java:222) at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:198) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203) {noformat} This is because we use a TreeMap to preserve order of elements in the map when reading from the underlying storage/serde. This problem is easily fixed in a number of ways: a) Switch to HashMap, which allows null keys. That does not preserve order of keys, which should not be important for map fields, but if we desire that, we have a solution for that too - LinkedHashMap, which would both retain order and allow us to insert null keys into the map. b) Ignore null keyed entries - check if the field we read is null, and if it is, then ignore that item in the record altogether. This way, HCat is robust in what it does - it does not terminate with an NPE, and it does not allow null keys in maps that might be problematic to layers above us that are not used to seeing nulls as keys in maps. Why do I bring up the second fix? I bring it up because of the way we discovered this bug. When reading from an RCFile, we do not notice this bug. If the same query that produced the RCFile instead produces an Orcfile, and we try reading from it, we see this problem. RCFile seems to be quietly stripping any null key entries, whereas Orc retains them. This is why we didn't notice this problem for a long while, and suddenly, now, we are. Now, if we fix our code to allow nulls in map keys through to layers above, we expose layers above to this change, which may then cause them to break. (Technically, this is stretching the case because we already break now if they care) More importantly, though, we have a case now, where the same data will be exposed differently if it were stored as orc or if it were stored as rcfile. And as a layer that is supposed to make storage invisible to the end user, HCat should attempt to provide some consistency in how data behaves to the end user. That said... There is another important concern at hand here: nulls in map keys might be due to bad data(corruption or loading error), and by stripping them, we might be silently hiding that from the user. 
This is an important point that does steer me towards the former approach, of passing it on to layers above, and standardize on an understanding that null keys in maps are acceptable data that layers above us have to handle. After that, it could be taken on as a further consistency fix, to fix RCFile so that it allows nulls in map keys. Having gone through this discussion of standardization, another important question is whether or not there is actually a use-case for null keys in maps in data. If there isn't, maybe we shouldn't allow writing that in the first place, and both orc and rcfile must simply error out to the end user if they try to write a null map key? Well, it is true that it is possible that data errors lead to null keys, but it's also possible that the user wants to store a mapping for value transformations, and they might have a transformation for null as well. In the case I encountered it, they were writing out an intermediate table after having read from a sparse table using a custom input format that generated an arbitrary number of columns, and were using the map to store column name mappings that would eventually be written out to another table. That seems a valid use, and we shouldn't prevent users from this sort of usage. Another
[jira] [Updated] (HIVE-4886) beeline code should have apache license headers
[ https://issues.apache.org/jira/browse/HIVE-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4886: Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) I just committed this. Thanks, Thejas! beeline code should have apache license headers --- Key: HIVE-4886 URL: https://issues.apache.org/jira/browse/HIVE-4886 Project: Hive Issue Type: Task Components: JDBC Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4886.2.patch, HIVE-4886.patch The Beeline JDBC client added as part of the Hive Server 2 changes is based on SQLLine. As Beeline is a modified version of SQLLine and the further modifications are also under the Apache license, the license headers of these files need to be replaced with Apache license headers. We already have the license text of SQLLine in the LICENSE file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
Part of the difficulty was that some of the test classes are in the wrong module and reference classes in a later module. I think the modules will have to be able to reference each other in many cases. SerDe and QL are tightly coupled. QL is really too large and we should find a way to cut that up. Part of this problem is the q.tests. I think one way to handle this is to only allow unit tests inside the module. I imagine running all the q tests would be done in a final module, hive-qtest. Or possibly two final modules, hive-qtest and hive-qtest-extra (tangential things like UDFs and input formats not core to Hive). On Wed, Aug 7, 2013 at 4:49 PM, Owen O'Malley omal...@apache.org wrote: On Wed, Aug 7, 2013 at 12:55 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: I'd like to propose we move towards Maven. Big +1 on this. Most of the major Apache projects (Hadoop, HBase, Avro, etc.) are Maven based. A big +1 from me too. I actually took a pass at it a couple of months ago. Part of the difficulty was that some of the test classes are in the wrong module and reference classes in a later module. Obviously that prevents any kind of modular build. An additional plus for Maven is that it includes tools to correct the project and module dependencies. -- Owen
[jira] [Commented] (HIVE-4964) Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced
[ https://issues.apache.org/jira/browse/HIVE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732741#comment-13732741 ] Harish Butani commented on HIVE-4964: - No, just dead code removal. This code was handling: - the 'having clause' based filters we originally supported with windowing; - and also the use of 'lead/lag' UDFs outside of UDAFs. We decided to remove support for these, if I recall, because: - associating having with windowing would be confusing to users. - lead/lag UDF invocations when multiple partitionings are involved are ambiguous. In some cases it is not clear in what order to evaluate the window expressions. We have already removed these features from the Semantic Analyzer, so they are not exposed to the user. This is a cleanup step of the Translator/PTFOperator, which still had code to handle these cases. Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced --- Key: HIVE-4964 URL: https://issues.apache.org/jira/browse/HIVE-4964 Project: Hive Issue Type: Bug Reporter: Harish Butani Priority: Minor Attachments: HIVE-4964.D11985.1.patch, HIVE-4964.D11985.2.patch There are still pieces of code that deal with: - supporting select expressions with Windowing - supporting a filter with windowing Need to do this before introducing Perf. improvements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
On Wed, Aug 7, 2013 at 2:04 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Part of the difficulty was that some of the test classes are in the wrong module and reference classes in a later module. I think the modules will have to be able to reference each other in many cases. SerDe and QL are tightly coupled. QL is really too large and we should find a way to cut that up. Of course the modules need to reference each other. The problematic test classes depend on modules lower in the tree, so they form a cycle in the dependency DAG. It only works in the Ant build because it compiles all of the modules before it does the test-compile in any of the modules. -- Owen Part of this problem is the q.tests. I think one way to handle this is to only allow unit tests inside the module. I imagine running all the q tests would be done in a final module, hive-qtest. Or possibly two final modules, hive-qtest and hive-qtest-extra (tangential things like UDFs and input formats not core to Hive). On Wed, Aug 7, 2013 at 4:49 PM, Owen O'Malley omal...@apache.org wrote: On Wed, Aug 7, 2013 at 12:55 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: I'd like to propose we move towards Maven. Big +1 on this. Most of the major Apache projects (Hadoop, HBase, Avro, etc.) are Maven based. A big +1 from me too. I actually took a pass at it a couple of months ago. Part of the difficulty was that some of the test classes are in the wrong module and reference classes in a later module. Obviously that prevents any kind of modular build. An additional plus for Maven is that it includes tools to correct the project and module dependencies. -- Owen
[jira] [Updated] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4324: -- Attachment: HIVE-4324.D12045.2.patch omalley updated the revision HIVE-4324 [jira] ORC Turn off dictionary encoding when number of distinct keys is greater than threshold. I addressed Ashutosh's feedback. Reviewers: ashutoshc, JIRA REVISION DETAIL https://reviews.facebook.net/D12045 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D12045?vs=37185id=37245#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java conf/hive-default.xml.template ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OutStream.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringRedBlackTree.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestFileDump.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java ql/src/test/queries/clientpositive/orc_dictionary_threshold.q ql/src/test/resources/orc-file-dump-dictionary-threshold.out ql/src/test/results/clientpositive/orc_dictionary_threshold.q.out To: JIRA, ashutoshc, omalley ORC Turn off dictionary encoding when number of distinct keys is greater than threshold --- Key: HIVE-4324 URL: https://issues.apache.org/jira/browse/HIVE-4324 Project: Hive Issue Type: Sub-task Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4324.1.patch.txt, HIVE-4324.D12045.1.patch, HIVE-4324.D12045.2.patch Add a configurable threshold so that if the number of distinct values in a string column is greater than that fraction of non-null values, dictionary encoding is turned off. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
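To make the described behavior concrete, here is a small hedged sketch of the kind of ratio check the issue calls for; the class, method, and parameter names are illustrative assumptions, not the logic of the attached patch or of ORC's actual WriterImpl.
{noformat}
// Illustrative only: decide whether to keep dictionary encoding for a
// string column, given a configured threshold such as 0.8.
public class DictionaryThresholdCheck {
  public static boolean keepDictionary(long distinctValues,
                                       long nonNullValues,
                                       double threshold) {
    if (nonNullValues == 0) {
      return true; // nothing written yet, keep the default encoding
    }
    // Turn dictionary encoding off once the fraction of distinct values
    // among non-null values exceeds the threshold (column is nearly unique).
    return ((double) distinctValues / nonNullValues) <= threshold;
  }

  public static void main(String[] args) {
    System.out.println(keepDictionary(100, 10000, 0.8));  // true: highly repetitive column
    System.out.println(keepDictionary(9500, 10000, 0.8)); // false: nearly unique column
  }
}
{noformat}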
[jira] [Updated] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4324: Fix Version/s: 0.12.0 Status: Patch Available (was: Open) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold --- Key: HIVE-4324 URL: https://issues.apache.org/jira/browse/HIVE-4324 Project: Hive Issue Type: Sub-task Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.12.0 Attachments: HIVE-4324.1.patch.txt, HIVE-4324.D12045.1.patch, HIVE-4324.D12045.2.patch Add a configurable threshold so that if the number of distinct values in a string column is greater than that fraction of non-null values, dictionary encoding is turned off. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732780#comment-13732780 ] Viraj Bhat commented on HIVE-4331: -- Hi Ashutosh, I have created 2 review requests, one which changes files in the HCatalog contrib and the other in Hive. Hope this helps in the review process. Hive: https://reviews.facebook.net/D12063 HCatalog: https://reviews.facebook.net/D12069 Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will continue to function, but internally they will use the DefaultStorageHandler from Hive. They will be removed in a future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in HCat's storage handler so that systems such as Pig and MapReduce can use Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without breaking backward compatibility (except known issues as described in the Design Document). 5) Replace all instances in the HCat source code which point to HCatStorageHandler to use the HiveStorageHandler, including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-4123: - Attachment: (was: ORC-Compression-Ratio-Comparison.xlsx) The RLE encoding for ORC can be improved Key: HIVE-4123 URL: https://issues.apache.org/jira/browse/HIVE-4123 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Prasanth J Labels: orcfile Fix For: 0.12.0 Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt, HIVE-4123.6.txt, ORC-Compression-Ratio-Comparison.xlsx The run length encoding of integers can be improved: * tighter bit packing * allow delta encoding * allow longer runs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
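As a rough, hedged illustration of why the listed improvements (delta encoding, longer runs) help, a toy delta-plus-run-length encoder for longs might look like the sketch below. The real ORC integer RLE is considerably more elaborate (it also does tighter bit packing), so this is only meant to show the intuition, not the patch.
{noformat}
import java.util.ArrayList;
import java.util.List;

// Toy sketch only: encode a long[] as (delta, runLength) pairs so that
// sorted or monotonically increasing columns collapse into a few pairs.
public class DeltaRunLengthSketch {
  public static List<long[]> encode(long[] values) {
    List<long[]> runs = new ArrayList<long[]>();
    long prev = 0;
    int i = 0;
    while (i < values.length) {
      long delta = values[i] - prev;
      int run = 1;
      // extend the run while consecutive deltas stay constant
      while (i + run < values.length
          && values[i + run] - values[i + run - 1] == delta) {
        run++;
      }
      runs.add(new long[] {delta, run});
      prev = values[i + run - 1];
      i += run;
    }
    return runs;
  }

  // Inverse of encode, to check that the sketch round-trips.
  public static List<Long> decode(List<long[]> runs) {
    List<Long> out = new ArrayList<Long>();
    long prev = 0;
    for (long[] r : runs) {
      for (long k = 0; k < r[1]; k++) {
        prev += r[0];
        out.add(prev);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    long[] ids = {5, 6, 7, 8, 20, 20, 20};
    // encodes to (5,1), (1,3), (12,1), (0,2): four pairs for seven values
    System.out.println(decode(encode(ids)));
  }
}
{noformat}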