[jira] [Created] (HIVE-11884) LLAP: Fix discrepancies with metadata_only_queries_with_filters.q

2015-09-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-11884:


 Summary: LLAP: Fix discrepancies with 
metadata_only_queries_with_filters.q
 Key: HIVE-11884
 URL: https://issues.apache.org/jira/browse/HIVE-11884
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Prasanth Jayachandran






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11892) UDTF run in local fetch task does not return rows forwarded during GenericUDTF.close()

2015-09-18 Thread Jason Dere (JIRA)
Jason Dere created HIVE-11892:
-

 Summary: UDTF run in local fetch task does not return rows 
forwarded during GenericUDTF.close()
 Key: HIVE-11892
 URL: https://issues.apache.org/jira/browse/HIVE-11892
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jason Dere


Using the example UDTF GenericUDTFCount2, which is part of hive-contrib:

{noformat}
create temporary function udtfCount2 as 
'org.apache.hadoop.hive.contrib.udtf.example.GenericUDTFCount2';

set hive.fetch.task.conversion=minimal;
-- Task created, correct output (2 rows)
select udtfCount2() from src;

set hive.fetch.task.conversion=more;
-- Runs in local task, incorrect output (0 rows)
select udtfCount2() from src;
{noformat}
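
The symptom can be illustrated with a toy model. This is only a sketch, not Hive's operator code: `CountTwiceUdtf` mimics the example UDTF (count rows in process, forward the count twice in close), and `runAndClose` and `runSnapshotBeforeClose` are hypothetical drivers. It shows one way rows forwarded from close() could be lost, namely if the caller reads its output before close() has run; whether the local fetch task fails for this reason is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Toy stand-in for a UDTF that emits rows only from close(); not Hive code. */
final class CountTwiceUdtf {
    private final Consumer<Long> collector; // receives forwarded rows
    private long count = 0;

    CountTwiceUdtf(Consumer<Long> collector) {
        this.collector = collector;
    }

    /** Called once per input row; only counts, forwards nothing. */
    void process(Object row) {
        count++;
    }

    /** Forwards the final count twice. */
    void close() {
        collector.accept(count);
        collector.accept(count);
    }
}

final class UdtfCloseDemo {
    /** Driver that reads the output only after close() has run. */
    static List<Long> runAndClose(List<?> input) {
        List<Long> out = new ArrayList<>();
        CountTwiceUdtf udtf = new CountTwiceUdtf(out::add);
        for (Object row : input) {
            udtf.process(row);
        }
        udtf.close();
        return out;
    }

    /** Hypothetical faulty driver: the output is copied before close(). */
    static List<Long> runSnapshotBeforeClose(List<?> input) {
        List<Long> out = new ArrayList<>();
        CountTwiceUdtf udtf = new CountTwiceUdtf(out::add);
        for (Object row : input) {
            udtf.process(row);
        }
        List<Long> snapshot = new ArrayList<>(out); // taken too early
        udtf.close();                                // rows arrive after the snapshot
        return snapshot;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("a", "b", "c");
        System.out.println(runAndClose(src).size());            // 2
        System.out.println(runSnapshotBeforeClose(src).size()); // 0
    }
}
```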








[jira] [Created] (HIVE-11894) CBO: Calcite Operator To Hive Operator (Calcite Return Path): correct table column name in CTAS queries

2015-09-18 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-11894:
--

 Summary: CBO: Calcite Operator To Hive Operator (Calcite Return 
Path): correct table column name in CTAS queries
 Key: HIVE-11894
 URL: https://issues.apache.org/jira/browse/HIVE-11894
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


To repro, run lineage2.q with return path turned on.





Re: Hive Start Up Time Manifolds Greater than Execution Time

2015-09-18 Thread Sergey Shelukhin
Actually, on second thought, even listing directories (which is necessary to
launch the job) could take a long time.
If there are any client logs, you can try to take a look to see where the
time is spent.
If you are running under Hive CLI, the logs would be in
/tmp/$USER/hive.log by default.




[jira] [Created] (HIVE-11893) LLAP: Update llap golden files after master merge

2015-09-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-11893:


 Summary: LLAP: Update llap golden files after master merge
 Key: HIVE-11893
 URL: https://issues.apache.org/jira/browse/HIVE-11893
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran








[jira] [Created] (HIVE-11897) JDO rollback can throw pointless exceptions

2015-09-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-11897:
---

 Summary: JDO rollback can throw pointless exceptions
 Key: HIVE-11897
 URL: https://issues.apache.org/jira/browse/HIVE-11897
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin








[jira] [Created] (HIVE-11898) support default partition in metastoredirectsql

2015-09-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-11898:
---

 Summary: support default partition in metastoredirectsql
 Key: HIVE-11898
 URL: https://issues.apache.org/jira/browse/HIVE-11898
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin








[jira] [Created] (HIVE-11895) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix udaf_percentile_approx_23.q

2015-09-18 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-11895:
--

 Summary: CBO: Calcite Operator To Hive Operator (Calcite Return 
Path): fix udaf_percentile_approx_23.q
 Key: HIVE-11895
 URL: https://issues.apache.org/jira/browse/HIVE-11895
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


Due to a type conversion problem.





[jira] [Created] (HIVE-11896) CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive default partition when inserting data

2015-09-18 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-11896:
--

 Summary: CBO: Calcite Operator To Hive Operator (Calcite Return 
Path): deal with hive default partition when inserting data
 Key: HIVE-11896
 URL: https://issues.apache.org/jira/browse/HIVE-11896
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


To repro, run dynpart_sort_opt_vectorization.q with return path turned on.





[jira] [Created] (HIVE-11877) LLAP: Fix ordering difference in acid_vectorization_partition.q

2015-09-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-11877:


 Summary: LLAP: Fix ordering difference in 
acid_vectorization_partition.q
 Key: HIVE-11877
 URL: https://issues.apache.org/jira/browse/HIVE-11877
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran








[jira] [Created] (HIVE-11885) LLAP: Remove unused/old golden files

2015-09-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-11885:


 Summary: LLAP: Remove unused/old golden files
 Key: HIVE-11885
 URL: https://issues.apache.org/jira/browse/HIVE-11885
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran








Re: Hive Start Up Time Manifolds Greater than Execution Time

2015-09-18 Thread Sergey Shelukhin
Which version of Hive and which file format are you using?
It could be either reading file footers for ORC (recent versions have a way 
to disable that: set hive.exec.orc.split.strategy=BI), or some similar step 
for other formats that I’m not immediately familiar with.
It could also be slow metastore calls.




Review Request 38503: Support special characters in quoted table names

2015-09-18 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38503/
---

Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

Right now table names can only match "[a-zA-Z_0-9]+". This patch 
investigates how much would have to change in order to support special 
characters, e.g., "/", in table names.
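
As a rough illustration of the current restriction, the sketch below checks names against a letters, digits, and underscore pattern. `TableNameCheck` and `isSimpleName` are hypothetical names, and the pattern is an assumption based on the description above, not Hive's actual validator.

```java
import java.util.regex.Pattern;

/** Sketch of a name check; the pattern is an assumption, not Hive's validator. */
final class TableNameCheck {
    // Letters, digits, and underscore only.
    private static final Pattern SIMPLE_NAME = Pattern.compile("[a-zA-Z_0-9]+");

    /** True if the whole name consists of the allowed characters. */
    static boolean isSimpleName(String name) {
        return SIMPLE_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSimpleName("src_2015")); // true
        System.out.println(isSimpleName("db/src"));   // false: '/' is rejected
    }
}
```

Supporting a name such as "db/src" means every place that relies on a check like this (or that builds paths and lock names from the table name) has to be revisited, which matches the breadth of the file list below.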


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f29da2
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java ee20430
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java a80f686
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java f88f4dd
  metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java bc0f6e3
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 4030075
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 210736b
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java be5a593
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/EmbeddedLockManager.java 7d7e7c0
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java fadd074
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManagerImpl.java ed022d9
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java fb954d8
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java c78e8f4
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 52ed4a3
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java 4966d89
  ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 8b7a2e8
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 1e2feaa
  ql/src/test/queries/clientnegative/special_character_in_tabnames_1.q PRE-CREATION
  ql/src/test/queries/clientpositive/special_character_in_tabnames_1.q PRE-CREATION
  ql/src/test/queries/clientpositive/special_character_in_tabnames_2.q PRE-CREATION
  ql/src/test/queries/clientpositive/special_character_in_tabnames_3.q PRE-CREATION
  ql/src/test/results/clientnegative/special_character_in_tabnames_1.q.out PRE-CREATION
  ql/src/test/results/clientpositive/special_character_in_tabnames_1.q.out PRE-CREATION
  ql/src/test/results/clientpositive/special_character_in_tabnames_2.q.out PRE-CREATION
  ql/src/test/results/clientpositive/special_character_in_tabnames_3.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/38503/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Created] (HIVE-11889) Add unit test for HIVE-11449

2015-09-18 Thread Wei Zheng (JIRA)
Wei Zheng created HIVE-11889:


 Summary: Add unit test for HIVE-11449
 Key: HIVE-11889
 URL: https://issues.apache.org/jira/browse/HIVE-11889
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.3.0, 2.0.0
Reporter: Wei Zheng
Assignee: Wei Zheng








[jira] [Created] (HIVE-11891) Add basic performance logging at trace level to metastore calls

2015-09-18 Thread Brock Noland (JIRA)
Brock Noland created HIVE-11891:
---

 Summary: Add basic performance logging at trace level to metastore 
calls
 Key: HIVE-11891
 URL: https://issues.apache.org/jira/browse/HIVE-11891
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 1.1.0, 1.2.0, 1.0.0
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Fix For: 2.0.0


At present it's extremely difficult to debug slow calls to the metastore. 
Ideally there would be some basic means of doing so, disabled by default.
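
A minimal sketch of what such opt-in timing might look like is shown below. It is not the HIVE-11891 patch: `TimedCall` and `timed` are hypothetical names, java.util.logging stands in for whatever logging framework the metastore uses, and FINEST plays the role of the trace level.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch of opt-in call timing; names are illustrative only. */
final class TimedCall {
    private static final Logger LOG = Logger.getLogger(TimedCall.class.getName());

    /** Runs the call and logs its duration only when FINEST (trace) is enabled. */
    static <T> T timed(String name, Supplier<T> call) {
        if (!LOG.isLoggable(Level.FINEST)) {
            return call.get(); // default path: no timing at all
        }
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long micros = (System.nanoTime() - start) / 1_000;
            LOG.log(Level.FINEST, "{0} took {1} us", new Object[] {name, micros});
        }
    }

    public static void main(String[] args) {
        // "get_partitions" is a placeholder label for some metastore call.
        Integer n = timed("get_partitions", () -> 42);
        System.out.println(n);
    }
}
```

Because the level check comes first, the wrapper costs one comparison per call when trace logging is off, which fits the "disabled by default" requirement.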





Re: Hive Start Up Time Manifolds Greater than Execution Time

2015-09-18 Thread Abhishek Das
There is a blog post on S3 optimization that you might find
useful.

https://www.quora.com/How-does-Qubole-improve-S3-performance



[jira] [Created] (HIVE-11888) LLAP: Merge master into branch (for HIVE-11860)

2015-09-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-11888:


 Summary: LLAP: Merge master into branch (for HIVE-11860)
 Key: HIVE-11888
 URL: https://issues.apache.org/jira/browse/HIVE-11888
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran








[jira] [Created] (HIVE-11887) spark tests break the build on a shared machine

2015-09-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-11887:
---

 Summary: spark tests break the build on a shared machine
 Key: HIVE-11887
 URL: https://issues.apache.org/jira/browse/HIVE-11887
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


Spark download creates UDFExampleAdd jar in /tmp; when building on a shared 
machine, someone else's jar from a build prevents this jar from being created 
(I have no permissions to this file because it was created by a different user) 
and the build fails.





[jira] [Created] (HIVE-11890) Create ORC module

2015-09-18 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-11890:


 Summary: Create ORC module
 Key: HIVE-11890
 URL: https://issues.apache.org/jira/browse/HIVE-11890
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley


Start moving classes over to the ORC module.





[jira] [Created] (HIVE-11880) IndexOutOfBoundsException when execute query with filter condition on type incompatible column(A) on data(composed by UNION ALL when a union column is constant and it

2015-09-18 Thread WangMeng (JIRA)
WangMeng created HIVE-11880:
---

 Summary: IndexOutOfBoundsException when execute query with 
filter condition on type incompatible column(A) on data(composed by UNION ALL 
when a union column is constant and it has incompatible type with 
corresponding column)
 Key: HIVE-11880
 URL: https://issues.apache.org/jira/browse/HIVE-11880
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.1
Reporter: WangMeng
Assignee: WangMeng


For Hive UNION ALL, when a union column is a constant (column a) whose type is 
incompatible with that of the corresponding column A, a query with a filter 
condition on that column over the UNION ALL result will cause an 
IndexOutOfBoundsException.

For example, with the TPC-H table orders:
CREATE VIEW `view_orders` AS select `oo`.`o_orderkey` , `oo`.`o_custkey`  from 
(  select  `rcfileorders`.`o_orderkey` , `rcfileorders`.`o_custkey` from 
`tpch270g`.`rcfileorders`   union all  select `textfileorders`.`o_orderkey` , 0L 
as `o_custkey`   from  `tpch270g`.`textfileorders`) `oo`;

The type of "o_custkey" is INT; the type of the corresponding constant column 0 is 
BIGINT.
The following query (with a filter on the incompatible column o_custkey) will then fail 
with java.lang.IndexOutOfBoundsException:
select count(1) from view_orders where o_custkey<10;





[jira] [Created] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered in Hive

2015-09-18 Thread Ratandeep Ratti (JIRA)
Ratandeep Ratti created HIVE-11878:
--

 Summary: ClassNotFoundException can possibly  occur if multiple 
jars are registered in Hive
 Key: HIVE-11878
 URL: https://issues.apache.org/jira/browse/HIVE-11878
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
Reporter: Ratandeep Ratti
Assignee: Ratandeep Ratti


When we register a jar on the Hive console, Hive creates a fresh 
URLClassLoader that includes the path of the jar being registered and all 
the jar paths of the parent classloader. The parent classloader is the current 
ThreadContextClassLoader. Once the URLClassLoader is created, Hive sets it as 
the current ThreadContextClassLoader.

So if we register multiple jars in Hive, multiple URLClassLoaders are 
created, each including the jars from its parent plus the one extra 
jar being registered. The last URLClassLoader created ends up as the 
current ThreadContextClassLoader. (For details see: 
org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)

Here is an example in which this strategy can lead to a ClassNotFoundException.
We register two jars, *j1* and *j2*, in the Hive console. *j1* contains the UDF class 
*c1*, which internally relies on class *c2* in jar *j2*. We register *j1* first; 
the URLClassLoader *u1* is created and set as the 
ThreadContextClassLoader. We register *j2* next; the new URLClassLoader *u2* is 
created with *u1* as its parent, and *u2* becomes the new 
ThreadContextClassLoader. Note that *u2* includes paths to both jars *j1* and *j2*, 
whereas *u1* only has the path to *j1*.

Now when we register class *c1* under a temporary function in Hive, we load the 
class using {code} Class.forName("c1", true, 
Thread.currentThread().getContextClassLoader()) {code}. The 
current thread context class-loader is *u2*, and it has the path to class 
*c1*, but class-loaders work by delegating to the parent class-loader 
first. In this case class *c1* will be found and *defined* by class-loader *u1*.

Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say 
initialize) is called in *c1* that references class *c2*, *c2* will not 
be found, since the class-loader used to search for *c2* will be *u1* (the 
caller's class-loader is used to load a class).
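
The delegation order described above can be sketched with a toy model. `ToyLoader` and `definerOf` are hypothetical names; the model only tracks which class names each loader can see and does not load real classes or jars.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model of parent-first delegation; it does not load real classes. */
final class ToyLoader {
    final String name;
    final ToyLoader parent;    // null for the root loader
    final Set<String> visible; // class names on this loader's own paths

    ToyLoader(String name, ToyLoader parent, String... classes) {
        this.name = name;
        this.parent = parent;
        this.visible = new HashSet<>(Arrays.asList(classes));
    }

    /** Returns the loader that would define the class, or null if none can. */
    ToyLoader definerOf(String className) {
        if (parent != null) {
            ToyLoader fromParent = parent.definerOf(className);
            if (fromParent != null) {
                return fromParent; // the parent is asked first and wins
            }
        }
        return visible.contains(className) ? this : null;
    }

    public static void main(String[] args) {
        ToyLoader u1 = new ToyLoader("u1", null, "c1");     // sees j1 only
        ToyLoader u2 = new ToyLoader("u2", u1, "c1", "c2"); // sees j1 and j2

        ToyLoader c1Definer = u2.definerOf("c1");
        System.out.println("c1 defined by " + c1Definer.name); // u1
        // References made by c1 are resolved through its defining loader.
        System.out.println("c2 found from there: " + (c1Definer.definerOf("c2") != null)); // false
    }
}
```

In this model *c2* is reachable from *u2* but not from *u1*, which is the loader that ends up defining *c1*.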


I've added a qtest to explain the problem. Please see the attached patch





Hive Start Up Time Manifolds Greater than Execution Time

2015-09-18 Thread Sreenath
Hi All,

Something interesting came to my notice yesterday when I was using Hive for
some queries. The time taken by Hive to launch a MapReduce job was
many times higher than the time taken by Hadoop to actually execute it.
These are the details of the table on which the query is run.

CREATE EXTERNAL TABLE A
(
user_id string,
stage string,
url string
)
PARTITIONED BY (dt string , id string)

All the data for the table is stored in S3, and each day there are around
2000 unique ids, i.e. 2000 partitions are added daily. We can assume
that each partition has on average 100MB of gzip-compressed data.
Now when I run a query like "SELECT DISTINCT user_id FROM A  WHERE
dt>='20150101' and dt <= '20150401'", i.e. over a period of 3 months, approx
6 partitions, it takes Hive approximately 2 hrs to launch the MapReduce
job, and the launched job finishes in just 20 min. So I was wondering if
someone can help me understand what Hive is doing in these 2 hrs?
Would really appreciate some help here. Thanks in advance


Best,
Sreenath
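
For scale, the figures quoted in the message (about 2000 new partitions per day, a dt range of 20150101 through 20150401, roughly 100MB per partition) can be turned into a rough estimate. This is only back-of-the-envelope arithmetic under those stated approximations; `PartitionEstimate` and its methods are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

/** Back-of-the-envelope estimate from the figures quoted in the message. */
final class PartitionEstimate {
    // Parses dates written as yyyyMMdd, e.g. 20150101.
    private static final DateTimeFormatter DT = DateTimeFormatter.BASIC_ISO_DATE;

    /** Days covered by dt >= from and dt <= to, both ends inclusive. */
    static long daysInclusive(String from, String to) {
        return ChronoUnit.DAYS.between(LocalDate.parse(from, DT), LocalDate.parse(to, DT)) + 1;
    }

    /** Partitions touched, assuming a fixed number of new partitions per day. */
    static long partitions(String from, String to, long perDay) {
        return daysInclusive(from, to) * perDay;
    }

    public static void main(String[] args) {
        long parts = partitions("20150101", "20150401", 2000);
        System.out.println(parts + " partitions");
        System.out.println(parts * 100 + " MB at about 100 MB per partition");
    }
}
```

Any per-partition work done before the job is launched (a metastore lookup, a directory listing on S3) is multiplied by that partition count.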


Review Request 38493: HIVE-11132

2015-09-18 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38493/
---

Review request for hive and Gopal V.


Bugs: HIVE-11132
https://issues.apache.org/jira/browse/HIVE-11132


Repository: hive-git


Description
---

Queries using join and group by produce incorrect output when 
hive.auto.convert.join=false and hive.optimize.reducededuplication=true


Diffs
-

  
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java 56334ed
  ql/src/test/queries/clientpositive/join_grp_diff_keys.q PRE-CREATION 
  ql/src/test/results/clientpositive/join_grp_diff_keys.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/38493/diff/


Testing
---

New q test and regression suite.


Thanks,

Ashutosh Chauhan



[jira] [Created] (HIVE-11881) Supporting HPL/SQL Packages

2015-09-18 Thread Dmitry Tolpeko (JIRA)
Dmitry Tolpeko created HIVE-11881:
-

 Summary: Supporting HPL/SQL Packages
 Key: HIVE-11881
 URL: https://issues.apache.org/jira/browse/HIVE-11881
 Project: Hive
  Issue Type: Improvement
  Components: hpl/sql
Reporter: Dmitry Tolpeko
Assignee: Dmitry Tolpeko


HPL/SQL should support packages similar to Oracle PL/SQL.





[jira] [Created] (HIVE-11883) 'transactional' table property for ACID should be case insensitive

2015-09-18 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-11883:
-

 Summary: 'transactional' table property for ACID should be case 
insensitive
 Key: HIVE-11883
 URL: https://issues.apache.org/jira/browse/HIVE-11883
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


Given:
{noformat}
CREATE TABLE mytable (col1 int, col2 string)
CLUSTERED BY (col1) INTO 2 BUCKETS
STORED AS ORC TBLPROPERTIES('TRANSACTIONAL'='TRUE');
{noformat}

update/delete statements will fail with 
{noformat}
FAILED: SemanticException [Error 10122]: Bucketized tables do not support 
INSERT INTO: Table: default.mytable
{noformat}

but 'transactional' (in lower case) works fine
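
A case-insensitive check might look like the sketch below. It is not Hive's actual code: `TransactionalProp` and `isTransactional` are hypothetical names, and the table properties are modeled as a plain string map.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a case-insensitive table-property check; not Hive's actual code. */
final class TransactionalProp {
    /** True if a key equal to "transactional" (ignoring case) maps to "true" (ignoring case). */
    static boolean isTransactional(Map<String, String> tblProps) {
        for (Map.Entry<String, String> e : tblProps.entrySet()) {
            if ("transactional".equalsIgnoreCase(e.getKey())) {
                return "true".equalsIgnoreCase(e.getValue());
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("TRANSACTIONAL", "TRUE");
        System.out.println(isTransactional(props));             // true
        System.out.println(props.containsKey("transactional")); // false: exact-match lookup misses it
    }
}
```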





[jira] [Created] (HIVE-11886) LLAP: merge master into branch

2015-09-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-11886:
---

 Summary: LLAP: merge master into branch
 Key: HIVE-11886
 URL: https://issues.apache.org/jira/browse/HIVE-11886
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap








[jira] [Created] (HIVE-11882) Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold

2015-09-18 Thread Illya Yalovyy (JIRA)
Illya Yalovyy created HIVE-11882:


 Summary: Fetch optimizer should stop source files traversal once 
it exceeds the hive.fetch.task.conversion.threshold
 Key: HIVE-11882
 URL: https://issues.apache.org/jira/browse/HIVE-11882
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Affects Versions: 1.0.0
Reporter: Illya Yalovyy


Hive 1.0's fetch optimizer tries to optimize queries of the form "select  
from  where  limit " to a fetch task (see the 
hive.fetch.task.conversion property). This optimization gets the lengths of all 
the files in the specified partition and compares them against a 
threshold value to determine whether it should use a fetch task or not (see the 
hive.fetch.task.conversion.threshold property). One of the main problems with 
this process of getting the lengths of all files is that the fetch 
optimizer doesn't seem to stop once it exceeds the 
hive.fetch.task.conversion.threshold. It works fine on HDFS, but could cause a 
significant performance degradation on other supported file systems.
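
The proposed behavior can be sketched as a running total that stops as soon as the threshold is exceeded. This is an illustration only, not the optimizer's code: `EarlyStopScan` and `filesReadUntilExceeded` are hypothetical names, and file lengths are passed in as plain numbers rather than fetched from a file system.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of threshold checking with early exit; not the Hive optimizer code. */
final class EarlyStopScan {
    /**
     * Adds up file lengths and stops at the first file that pushes the total
     * over the threshold. Returns how many lengths were read at that point,
     * or -1 if the total never exceeds the threshold.
     */
    static int filesReadUntilExceeded(Iterable<Long> fileLengths, long threshold) {
        long total = 0;
        int read = 0;
        for (long len : fileLengths) {
            read++;
            total += len;
            if (total > threshold) {
                return read; // the remaining files need not be examined
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Long> lengths = Arrays.asList(400L, 700L, 5000L, 9000L);
        System.out.println(filesReadUntilExceeded(lengths, 1000L));  // 2
        System.out.println(filesReadUntilExceeded(lengths, 50000L)); // -1
    }
}
```

On a file system where each length lookup is a remote call, stopping after the second file instead of reading all of them is where the saving would come from.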


