[jira] [Created] (HIVE-12303) HCatRecordSerDe throws an IndexOutOfBoundsException

2015-10-30 Thread Xiaowei Wang (JIRA)
Xiaowei Wang created HIVE-12303:
---

 Summary:  HCatRecordSerDe throws an IndexOutOfBoundsException
 Key: HIVE-12303
 URL: https://issues.apache.org/jira/browse/HIVE-12303
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.2.1, 0.14.0
Reporter: Xiaowei Wang
Assignee: Sushanth Sowmyan
 Fix For: 1.2.1


When accessing a Hive table using HCatalog in Pig, it sometimes throws an exception!

Exception

{noformat}
2015-10-30 06:44:35,219 WARN [Thread-4] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple
	at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
	at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:59)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.IndexOutOfBoundsException: Index: 24, Size: 24
	at java.util.ArrayList.rangeCheck(ArrayList.java:635)
	at java.util.ArrayList.get(ArrayList.java:411)
	at org.apache.hive.hcatalog.data.HCatRecordSerDe.serializeStruct(HCatRecordSerDe.java:175)
	at org.apache.hive.hcatalog.data.HCatRecordSerDe.serializeList(HCatRecordSerDe.java:244)
	at org.apache.hive.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:196)
	at org.apache.hive.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53)
	at org.apache.hive.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97)
	at org.apache.hive.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:204)
	at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:63)
	... 13 more

{noformat}
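The "Index: 24, Size: 24" at HCatRecordSerDe.serializeStruct suggests the table schema declares more fields than the underlying row actually holds, so indexing the row list by schema position walks past the end. A minimal sketch of that failure mode and a bounds guard (plain JDK types, not Hive's actual code):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the failure mode: the schema says N fields, the row has
// fewer, and row.get(i) throws IndexOutOfBoundsException at i == size.
public class StructSerializeSketch {

    // Returns the field at the schema position, or null when the row is
    // short -- the kind of guard that would avoid the exception.
    static Object safeField(List<Object> row, int schemaIndex) {
        return schemaIndex < row.size() ? row.get(schemaIndex) : null;
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.asList("a", "b"); // row has 2 fields
        int schemaSize = 3;                         // schema declares 3
        for (int i = 0; i < schemaSize; i++) {
            // An unguarded row.get(i) would throw at i == 2, matching
            // "Index: 24, Size: 24" in the report.
            System.out.println(safeField(row, i));
        }
    }
}
```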





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12302) Use KryoPool instead of thread-local caching

2015-10-30 Thread Gopal V (JIRA)
Gopal V created HIVE-12302:
--

 Summary: Use KryoPool instead of thread-local caching
 Key: HIVE-12302
 URL: https://issues.apache.org/jira/browse/HIVE-12302
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 2.0.0
Reporter: Gopal V
Assignee: Gopal V


Kryo 3.x introduces a Pooling mechanism for Kryo

https://github.com/EsotericSoftware/kryo#pooling-kryo-instances

{code}
// Build pool with SoftReferences enabled (optional)
KryoPool pool = new KryoPool.Builder(factory).softReferences().build();
{code}
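KryoPool in Kryo 3.x follows a standard borrow/release pattern: obtain an instance with pool.borrow(), return it with pool.release(kryo) in a finally block, or use the callback-style pool.run(...). A minimal generic sketch of that pattern using plain JDK types (a stand-in, not the Kryo classes themselves):

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Generic borrow/release pool in the style of Kryo 3.x's KryoPool.
// Instances are created lazily by the factory and reused across calls,
// replacing the per-thread caching this issue proposes to remove.
public class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    public SimplePool(Supplier<T> factory) { this.factory = factory; }

    // Hand out a cached instance if one is free, else create a new one.
    public synchronized T borrow() {
        T obj = free.poll();
        return obj != null ? obj : factory.get();
    }

    // Return an instance to the pool for reuse by the next caller.
    public synchronized void release(T obj) { free.push(obj); }
}
```

With the real API, the Kryo-specific builder options (such as softReferences() above) additionally let pooled instances be reclaimed under memory pressure.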





Re: Review Request 39735: HIVE-12215 Exchange partition does not show outputs field for post/pre execute hooks

2015-10-30 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39735/#review104613
---



metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java (line 464)


Tab/space.


- Xuefu Zhang


On Oct. 29, 2015, 4:50 p.m., Aihua Xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39735/
> ---
> 
> (Updated Oct. 29, 2015, 4:50 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-12215 Exchange partition does not show outputs field for post/pre 
> execute hooks
> 
> 
> Diffs
> -
> 
>   metastore/if/hive_metastore.thrift 3e30f56 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h c8f16a7 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp a82c363 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 9eca65c 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 0c67416 
>   metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php e922d7d 
>   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 8dba17b 
>   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 59c7b94 
>   metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 7b93158 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java cf2e25b 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 3960f5d 
>   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java f3a23f5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java dcac9ca 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java cef297a 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java b4546e1 
>   ql/src/test/results/clientnegative/exchange_partition.q.out 8622615 
>   ql/src/test/results/clientpositive/exchange_partition.q.out 5b21eaf 
>   ql/src/test/results/clientpositive/exchange_partition2.q.out 8c7c583 
>   ql/src/test/results/clientpositive/exchange_partition3.q.out 3815861 
>   ql/src/test/results/clientpositive/exchgpartition2lel.q.out 5997d6b 
> 
> Diff: https://reviews.apache.org/r/39735/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Aihua Xu
> 
>



[jira] [Created] (HIVE-12309) TableScan should use colStats when available for better data size estimate

2015-10-30 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-12309:
---

 Summary: TableScan should use colStats when available for better data size estimate
 Key: HIVE-12309
 URL: https://issues.apache.org/jira/browse/HIVE-12309
 Project: Hive
  Issue Type: Improvement
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Currently, all other operators use column stats to figure out data size, whereas TableScan relies on rawDataSize. This can result in an inconsistency where the TableScan reports a lower data size than subsequent operators.





[jira] [Created] (HIVE-12310) Update memory estimation logic in TopNHash

2015-10-30 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-12310:


 Summary: Update memory estimation logic in TopNHash
 Key: HIVE-12310
 URL: https://issues.apache.org/jira/browse/HIVE-12310
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Thejas M Nair


HIVE-12084 changed TopNHash to use Runtime.getRuntime().freeMemory() to find available memory.
However, freeMemory() does not report all the memory the JVM could use; it ignores unallocated memory. This is because the JVM heap grows up to the max heap size (-Xmx) as needed: totalMemory() gives the total heap space currently allocated, and freeMemory() is the free memory within that.
See http://i.stack.imgur.com/GjuwM.png and 
http://stackoverflow.com/questions/3571203/what-is-the-exact-meaning-of-runtime-getruntime-totalmemory-and-freememory
 .
So instead of Runtime.getRuntime().freeMemory(), I think it should use 
maxMemory() - totalMemory() + freeMemory()
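The proposed formula expressed directly against the Runtime API (this is the suggestion from the report, not code taken from TopNHash itself):

```java
// Available memory = headroom the heap can still grow into
// (maxMemory - totalMemory) plus the free space already allocated
// within the current heap (freeMemory).
public class MemoryEstimate {
    public static long available() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }

    public static void main(String[] args) {
        // freeMemory() alone understates what is usable before -Xmx is hit.
        System.out.println("available bytes:  " + available());
        System.out.println("freeMemory alone: " + Runtime.getRuntime().freeMemory());
    }
}
```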





[jira] [Created] (HIVE-12308) Make ParseContext::semanticInputs a map

2015-10-30 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-12308:


 Summary: Make ParseContext::semanticInputs a map
 Key: HIVE-12308
 URL: https://issues.apache.org/jira/browse/HIVE-12308
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


Follow-up JIRA for HIVE-7723. Explain plan for a complex query with many partitions is slow due to the inefficient collection used to find a matching ReadEntity.

As part of HIVE-7723, we create a map during PlanUtils.addPartitionInputs(); we should start with a map in ParseContext::semanticInputs, if possible, to save the CPU burn of this additional conversion.
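A sketch of the lookup the change enables, keyed by entity name so find-or-add is an O(1) map operation rather than an O(n) scan (the String-valued "entity" here is a hypothetical stand-in for ReadEntity, not the real class):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of keeping semanticInputs keyed by entity name, so finding a
// matching entry is a hash lookup instead of scanning a collection.
public class SemanticInputs {
    // LinkedHashMap preserves insertion order, as a list-backed
    // collection would, while giving O(1) lookup by name.
    private final Map<String, String> inputs = new LinkedHashMap<>();

    // Returns the existing entity for this name, or registers the new one.
    public String addOrGet(String name, String entity) {
        return inputs.computeIfAbsent(name, n -> entity);
    }

    public int size() { return inputs.size(); }
}
```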





Re: [DISCUSS] Re: deprecating MR in the first release of Hive 2.0

2015-10-30 Thread Thejas Nair
the jira - https://issues.apache.org/jira/browse/HIVE-12300


On Mon, Oct 26, 2015 at 2:58 PM, Sergey Shelukhin
 wrote:
> There appear to be no objections, so I will start by filing a JIRA :)
>
> On 15/10/22, 14:38, "Thejas Nair"  wrote:
>
>>(Adding [DISCUSS] to subject to bring it to attention of wider audience.)
>>
>>+1 Given how much investment is going into Tez and Spark execution
>>modes, it makes sense to convey that better to the user community and
>>recommend the use of the new modes over MR. Users who choose those
>>modes are going to get better experience, and it will help to improve
>>the overall perception of Hive.
>>
>>Once most users have moved to the new modes, we can start looking into
>>removing MR support. (Though that is likely to take a while).
>>
>>
>>On Wed, Oct 21, 2015 at 9:44 PM, Sergey Shelukhin
>> wrote:
>>> We have discussed the removal of hadoop-1 and MR support in the Hive 2
>>>line in the past.
>>> Hadoop-1 removal seems to be non-controversial and on track; before we
>>>cut the first release of Hive 2, I propose we deprecate MR.
>>>
>>> Tez and Spark engines provide vast perf improvements over MR;
>>> Execution optimization work by most contributors for a long time has
>>>been done for these engines and is not portable to MR, so it is
>>>languishing further;
>>> At the same time, supporting the additional code has development costs
>>>for new features and bug fixes, plus we have to run tests for it, both in
>>>Apache and for local changes, and deploy the code.
>>>
>>> However, MR is hard to remove. It may also provide a baseline for
>>>debugging bugs in other engines (not bulletproof, since the MR logic can
>>>itself be incorrect), or something to mock during perf benchmarks.
>>>
>>> Therefore, I propose that for now we add deprecation warnings
>>>suggesting the other alternatives:
>>>
>>>   *   to Hive configuration documentation.
>>>   *   to Hive wiki.
>>>   *   to release notes on Hive 2.
>>>   *   in Beeline and CLI when using MR.
>>>
>>> Additionally, I propose we remove Minimr test driver from HiveQA runs
>>>for master.
>>>
>>> What do you think?
>>
>


[jira] [Created] (HIVE-12306) hbase_queries.q fails in Hive 1.3.0

2015-10-30 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-12306:
--

 Summary: hbase_queries.q fails in Hive 1.3.0
 Key: HIVE-12306
 URL: https://issues.apache.org/jira/browse/HIVE-12306
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Trivial


hbase_queries.q is failing (only in version 1.3.0)





[jira] [Created] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-10-30 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-12304:
---

 Summary: "drop database cascade" needs to unregister functions
 Key: HIVE-12304
 URL: https://issues.apache.org/jira/browse/HIVE-12304
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Currently, the "drop database cascade" command doesn't unregister the functions 
under the database. 





Re: [CONF] Apache Hive > GettingStarted

2015-10-30 Thread Alan Gates
This isn't correct.  If we just want to build jars, the command is "mvn 
clean install -Phadoop-1".  If we want to build packages the command is 
"mvn clean package -Phadoop-1,dist".  I wasn't sure which was intended 
and opted for building packages.  We should clarify which is intended 
and then give the right command for it.


Alan.

Lefty Leverenz (Confluence) wrote:
 
	Lefty Leverenz *edited* a page


*Change comment:* restore "install" to branch-1 build command, 
capitalize maven


GettingStarted 
...

In branch-1, Hive supports both Hadoop 1.x and 2.x.  You will need to 
specify which version of Hadoop to build against via a Maven profile.  
To build against Hadoop 1.x use the profile |hadoop-1|; for Hadoop 2.x 
use |hadoop-2|.  For example, to build against Hadoop 1.x, the above 
mvn command becomes:


No Format
   $ mvn clean install -Phadoop-1,dist


Compile Hive Prior to 0.13 on Hadoop 0.20

...


This message was sent by Atlassian Confluence 5.8.4



[jira] [Created] (HIVE-12305) CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not pull up constant expressions

2015-10-30 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-12305:
--

 Summary: CBO: Calcite Operator To Hive Operator (Calcite Return 
Path): UDAF can not pull up constant expressions
 Key: HIVE-12305
 URL: https://issues.apache.org/jira/browse/HIVE-12305
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


To reproduce, run annotate_stats_groupby.q with the return path turned on.





[jira] [Created] (HIVE-12307) TransactionBatch.close() must abort any remaining transactions in the batch

2015-10-30 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-12307:
-

 Summary: TransactionBatch.close() must abort any remaining 
transactions in the batch
 Key: HIVE-12307
 URL: https://issues.apache.org/jira/browse/HIVE-12307
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Transactions
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


When the client of the TransactionBatch API encounters an error, it must 
close() the batch and start a new one.  This prevents attempts to continue 
writing to a file that may be damaged in some way.

close() should abort any transactions that still remain in the batch and 
close (best effort) all the files it is writing to.
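A sketch of the "best effort" abort loop the description asks for: each remaining transaction is aborted independently, so a failure on one does not prevent cleanup of the rest. The types here are hypothetical stand-ins, not the actual Streaming API implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of a best-effort close(): abort every remaining txn in the
// batch, swallowing per-txn failures so cleanup continues.
public class BatchCloser {
    // 'abortTxn' stands in for the metastore call that aborts a txn id.
    // Returns the ids that were successfully aborted.
    public static List<Long> abortRemaining(List<Long> remainingTxnIds,
                                            Consumer<Long> abortTxn) {
        List<Long> aborted = new ArrayList<>();
        for (Long txnId : remainingTxnIds) {
            try {
                abortTxn.accept(txnId);
                aborted.add(txnId);
            } catch (RuntimeException e) {
                // best effort: log and continue with the next txn
            }
        }
        return aborted;
    }
}
```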


