[jira] [Commented] (HIVE-20867) Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by

2018-11-05 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675765#comment-16675765
 ] 

Pengcheng Xiong commented on HIVE-20867:


Thanks Gopal for the explanation. I can see the potential benefit of using left 
semi join over the existing implementation in some scenarios. If it is decided 
case by case, I think it may be better to add a cost-based metric or a Hive 
configuration on which the decision can be made. That is only my suggestion. 
You guys can decide what to do in the end.  :)

> Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by
> -
>
> Key: HIVE-20867
> URL: https://issues.apache.org/jira/browse/HIVE-20867
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20867) Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by

2018-11-05 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675550#comment-16675550
 ] 

Pengcheng Xiong commented on HIVE-20867:


I have some questions about this jira. Could you share your design document on 
this? I assumed that we compared several candidates when we made the decision, 
and left semi join was one of them. We chose the union-based one because a) a 
similar approach can be applied to except(all) as well, thus we have better 
code reuse; b) when we have more than 2 branches as the inputs of intersect, we 
assume that in the future those branches can be executed in parallel. With the 
left semi join approach, by comparison, we need to do the joins one by one.
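
To make the two candidates concrete, here is a small in-memory Java sketch of 
INTERSECT (distinct) over several branches. The class and method names are 
hypothetical and this is not Hive's planner code. It only illustrates the shape 
of each rewrite: count-per-value over a union of de-duplicated branches versus 
a chain of semi joins applied one after another.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not Hive's planner code.
class IntersectSketch {

    // Union-based: de-duplicate each branch, union everything, group by value,
    // and keep the values that were seen in every branch. Each branch is
    // processed independently before the final group-by.
    static Set<String> unionGroupBy(List<List<String>> branches) {
        Map<String, Integer> branchCount = new HashMap<>();
        for (List<String> branch : branches) {
            for (String value : new HashSet<>(branch)) {      // per-branch distinct
                branchCount.merge(value, 1, Integer::sum);    // union all + count
            }
        }
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : branchCount.entrySet()) {
            if (entry.getValue() == branches.size()) {        // present in all branches
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Semi-join-based: start from the first branch and filter it against each
    // remaining branch in turn, one join after another.
    static Set<String> semiJoinChain(List<List<String>> branches) {
        Set<String> result = new TreeSet<>(branches.get(0));
        for (int i = 1; i < branches.size(); i++) {
            result.retainAll(new HashSet<>(branches.get(i))); // acts as a left semi join
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> branches = Arrays.asList(
            Arrays.asList("a", "b", "b", "c"),
            Arrays.asList("b", "c", "d"),
            Arrays.asList("c", "b", "e"));
        System.out.println(unionGroupBy(branches));  // [b, c]
        System.out.println(semiJoinChain(branches)); // [b, c]
    }
}
```

Both methods return the same set. The difference is in the plan shape: the 
first touches every branch independently, while the second has to process the 
branches sequentially.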



[jira] [Commented] (HIVE-20000) woooohoo20000ooooooo

2018-06-26 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524223#comment-16524223
 ] 

Pengcheng Xiong commented on HIVE-2:


congrats! :)

> whoo2ooo
> 
>
> Key: HIVE-2
> URL: https://issues.apache.org/jira/browse/HIVE-2
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Affects Versions: All Versions
>Reporter: Prasanth Jayachandran
>Assignee: Hive QA
>Priority: Blocker
> Fix For: All Versions
>
>
> {code:java}
>    :::  :::  :::  ::: 
> :+::+::+:   :+::+:   :+::+:   :+::+:   :+:
>   +:+ +:+  :+:++:+  :+:++:+  :+:++:+  :+:+
> +#+   +#+ + +:++#+ + +:++#+ + +:++#+ + +:+
>   +#+ +#+#  +#++#+#  +#++#+#  +#++#+#  +#+
>  #+#  #+#   #+##+#   #+##+#   #+##+#   #+#
> ## ###  ###  ###  ### 
> {code}





[jira] [Commented] (HIVE-16042) special characters in the comment of sql file cause ParseException

2018-05-08 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468144#comment-16468144
 ] 

Pengcheng Xiong commented on HIVE-16042:


Hi [~jameszhouyi], as I said in the previous thread, if you want to use a 
comment, you should put "--" at the beginning of a line rather than in the 
middle of a line.
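
As a rough model of why the placement matters, assume the script reader drops 
only lines that start with "--" and then splits the remaining text naively on 
';'. Both behaviors are assumptions made for illustration, and the class below 
is a hypothetical sketch, not Hive's actual CLI code.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of a SQL script reader; not Hive's CLI code.
class CommentLineSketch {

    // Assumption: only lines that begin with "--" are treated as comments.
    // Everything else, including trailing "-- ..." text, is kept as-is.
    static String dropFullLineComments(String script) {
        StringBuilder kept = new StringBuilder();
        for (String line : script.split("\n")) {
            if (!line.startsWith("--")) {
                kept.append(line).append('\n');
            }
        }
        return kept.toString();
    }

    // Assumption: the remaining text is then split naively on ';'.
    static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : dropFullLineComments(script).split(";")) {
            if (!part.trim().isEmpty()) {
                statements.add(part.trim());
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        String commentOnOwnLine = "-- keep rows; drop nulls\nSELECT 1;\n";
        String commentMidLine = "SELECT 1 -- keep rows; drop nulls\n;\n";
        System.out.println(splitStatements(commentOnOwnLine).size()); // 1
        System.out.println(splitStatements(commentMidLine).size());   // 2
    }
}
```

Under these assumptions, a ';' inside a mid-line comment cuts the statement in 
two, and the second fragment is no longer valid SQL.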

> special characters in the comment of sql file cause ParseException
> --
>
> Key: HIVE-16042
> URL: https://issues.apache.org/jira/browse/HIVE-16042
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
> Environment: Hive2.2 (commit: 2768361)
> TPCx-BB v1.2
>Reporter: KaiXu
>Priority: Major
> Attachments: q04.sql, q17.sql, q18.sql, q23.sql
>
>
> current Hive upstream(commit: 2768361) failed to parse some 
> queries(q04,q17,q18,q23) in TPCx-BB v1.2, while it's ok with Hive(commit: 
> ac68aed).
> Q04: FAILED: ParseException line 24:0 missing EOF at ';' near 
> 'abandonedShoppingCartsPageCountsPerSession'
> Q17:
> NoViableAltException(350@[])
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.limitClause(HiveParser.java:38898)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:37002)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:36404)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:35722)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:35610)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2279)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1328)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
> at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:75)
> at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:68)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:474)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:490)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 39:0 cannot recognize input near 'LIMIT' '100' 
> ';' in limit clause
> Q18:
> NoViableAltException(350@[()* loopback of 424:20: ( ( LSQUARE ^ expression 
> RSQUARE !) | ( DOT ^ identifier ) )*])
> at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
> at org.antlr.runtime.DFA.predict(DFA.java:116)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6665)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6992)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:7048)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7210)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7353)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7496)
>

[jira] [Commented] (HIVE-19059) Support DEFAULT keyword with INSERT and UPDATE

2018-03-26 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414858#comment-16414858
 ] 

Pengcheng Xiong commented on HIVE-19059:


"-> {$expr.tree.getText() == "default"}?"

Maybe you should use .equals for the string comparison?
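
A minimal Java sketch of the difference, assuming the text being tested is a 
string built at runtime (as the result of a parser's getText() call typically 
is). The class and method names are made up for illustration.

```java
// Illustration of reference vs. value comparison for strings.
class StringCompareSketch {

    // Reference comparison: true only if both sides are the same object.
    static boolean sameReference(String text) {
        return text == "default";
    }

    // Value comparison: true whenever the characters match.
    static boolean sameValue(String text) {
        return "default".equals(text);
    }

    public static void main(String[] args) {
        // Built at runtime, so it is a different object from the literal.
        String text = new StringBuilder("def").append("ault").toString();
        System.out.println(sameReference(text)); // false
        System.out.println(sameValue(text));     // true
    }
}
```

Putting the literal on the left also makes the check null-safe. If the keyword 
should match regardless of case, equalsIgnoreCase would be the analogous call, 
but whether that is wanted here is an assumption.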

> Support DEFAULT keyword with INSERT and UPDATE
> --
>
> Key: HIVE-19059
> URL: https://issues.apache.org/jira/browse/HIVE-19059
> Project: Hive
>  Issue Type: New Feature
>  Components: SQL
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19059.1.patch
>
>
> Support DEFAULT keyword in INSERT e.g.
> {code:sql}
> INSERT INTO TABLE t values (DEFAULT, DEFAULT)
> {code}
> or with UPDATE
> {code:sql}
> UPDATE TABLE t SET col1=DEFAULT WHERE col2 > 4
> {code}





[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected

2018-01-05 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313674#comment-16313674
 ] 

Pengcheng Xiong commented on HIVE-18375:


[~pauljackson123], I see. But all 4 of the queries involve ORDER BY, if 
"involve" means that ORDER BY appears in the query text. Those queries 
should be runnable on current Hive master, as it contains HIVE-15160, which 
enables "order by non-selected column". The reason you cannot run them on 
your cluster is that HIVE-15160 is not in any release (including the one you 
are using) yet. I think you may need to wait until the next release that 
includes this patch. Thanks.

> Cannot ORDER by subquery fields unless they are selected
> 
>
> Key: HIVE-18375
> URL: https://issues.apache.org/jira/browse/HIVE-18375
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.3.2
> Environment: Amazon AWS
> Release label:emr-5.11.0
> Hadoop distribution:Amazon 2.7.3
> Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1
> classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict]
>Reporter: Paul Jackson
>Priority: Minor
>
> Given these tables:
> {code:SQL}
> CREATE TABLE employees (
> emp_no  INT,
> first_name  VARCHAR(14),
> last_name   VARCHAR(16)
> );
> insert into employees values
> (1, 'Gottlob', 'Frege'),
> (2, 'Bertrand', 'Russell'),
> (3, 'Ludwig', 'Wittgenstein');
> CREATE TABLE salaries (
> emp_no  INT,
> salary  INT,
> from_date   DATE,
> to_date DATE
> );
> insert into salaries values
> (1, 10, '1900-01-01', '1900-01-31'),
> (1, 18, '1900-09-01', '1900-09-30'),
> (2, 15, '1940-03-01', '1950-01-01'),
> (3, 20, '1920-01-01', '1950-01-01');
> {code}
> This query returns the names of the employees ordered by their peak salary:
> {code:SQL}
> SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary`
> FROM `default`.`employees`
> INNER JOIN
>  (SELECT `emp_no`, MAX(`salary`) `max_salary`
>   FROM `default`.`salaries`
>   WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
>   GROUP BY `emp_no`) AS `t1`
> ON `employees`.`emp_no` = `t1`.`emp_no`
> ORDER BY `t1`.`max_salary` DESC;
> {code}
> However, this should still work even if the max_salary is not part of the 
> projection:
> {code:SQL}
> SELECT `employees`.`last_name`, `employees`.`first_name`
> FROM `default`.`employees`
> INNER JOIN
>  (SELECT `emp_no`, MAX(`salary`) `max_salary`
>   FROM `default`.`salaries`
>   WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
>   GROUP BY `emp_no`) AS `t1`
> ON `employees`.`emp_no` = `t1`.`emp_no`
> ORDER BY `t1`.`max_salary` DESC;
> {code}
> However, that fails with this error:
> {code}
> Error while compiling statement: FAILED: SemanticException [Error 10004]: 
> line 9:9 Invalid table alias or column reference 't1': (possible column names 
> are: last_name, first_name)
> {code}
> FWIW, this also fails:
> {code:SQL}
> SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` 
> AS `max_sal`
> FROM `default`.`employees`
> INNER JOIN
>  (SELECT `emp_no`, MAX(`salary`) `max_salary`
>   FROM `default`.`salaries`
>   WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
>   GROUP BY `emp_no`) AS `t1`
> ON `employees`.`emp_no` = `t1`.`emp_no`
> ORDER BY `t1`.`max_salary` DESC;
> {code}
> But this succeeds:
> {code:SQL}
> SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` 
> AS `max_sal`
> FROM `default`.`employees`
> INNER JOIN
>  (SELECT `emp_no`, MAX(`salary`) `max_salary`
>   FROM `default`.`salaries`
>   WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
>   GROUP BY `emp_no`) AS `t1`
> ON `employees`.`emp_no` = `t1`.`emp_no`
> ORDER BY `max_sal` DESC;
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18359) Extend grouping set limits from int to long

2018-01-04 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312461#comment-16312461
 ] 

Pengcheng Xiong commented on HIVE-18359:


LGTM +1 pending tests.  :)

> Extend grouping set limits from int to long
> ---
>
> Key: HIVE-18359
> URL: https://issues.apache.org/jira/browse/HIVE-18359
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18359.1.patch, HIVE-18359.2.patch
>
>
> Grouping sets are broken for >32 columns because an Int is used for the bitmap 
> (also for the GROUPING__ID virtual column). This assumption breaks grouping 
> sets/rollups/cube when the number of participating aggregation columns is >32. 
> The easier fix would be to extend it to Long for now. The correct fix would be 
> to use BitSets everywhere, but that would require changing the GROUPING__ID 
> column type to binary, which would make predicates on GROUPING__ID difficult to deal with. 
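
The 32-column limit described above follows from how a 32-bit bitmap behaves. 
A minimal Java sketch, with hypothetical names and no relation to Hive's 
actual grouping-set code:

```java
// Illustration of why a 32-bit bitmap cannot address positions >= 32.
class GroupingBitmapSketch {

    // Sets bit `position` in an int bitmap. Java masks int shift counts to
    // 5 bits, so position 32 silently wraps around to position 0.
    static int setBitInt(int bitmap, int position) {
        return bitmap | (1 << position);
    }

    // Same operation on a long bitmap, which has room for positions 0..63.
    static long setBitLong(long bitmap, int position) {
        return bitmap | (1L << position);
    }

    public static void main(String[] args) {
        System.out.println(setBitInt(0, 32));   // 1, collides with position 0
        System.out.println(setBitLong(0L, 32)); // 4294967296
    }
}
```

Moving to long only raises the ceiling to 64 positions, which matches the 
description's point that it is the easier fix rather than the general one.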





[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected

2018-01-04 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312371#comment-16312371
 ] 

Pengcheng Xiong commented on HIVE-18375:


[~pauljackson123], I am sorry, but I see that all of your cases above involve 
ORDER BY. Which simpler issue do you mean?



[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected

2018-01-04 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312344#comment-16312344
 ] 

Pengcheng Xiong commented on HIVE-18375:


[~pauljackson123], if possible, could you try Hive master? As this is a new 
feature from HIVE-15160 targeting version 3.0, I doubt it is available in any 
published version yet.



[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected

2018-01-04 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312255#comment-16312255
 ] 

Pengcheng Xiong commented on HIVE-18375:


May be related to HIVE-15160.



[jira] [Commented] (HIVE-18144) Runtime type inference error when join three table for different column type

2017-12-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275695#comment-16275695
 ] 

Pengcheng Xiong commented on HIVE-18144:


[~wanghaihua], can you add "set hive.cbo.enable=false" in your ptest file and 
test again?

> Runtime type inference error when join three table for different column type 
> -
>
> Key: HIVE-18144
> URL: https://issues.apache.org/jira/browse/HIVE-18144
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Wang Haihua
>Assignee: Wang Haihua
> Attachments: HIVE-18144.1.patch
>
>
> Union operation with three or more tables, which have different column types, 
> may cause a type inference error during task execution.
> E.g. t1(with column int) union all t2(with column int) union all t3(with 
> column bigint), finally should be {{bigint}},
> RowSchema of union t1 with t2, we call {{leftOp}}, should be int, then leftOp 
> union t3 should finally be bigint.
> This means the RowSchema of leftOp should be {{bigint}} instead of {{int}}.
> However we see in {{SemanticAnalyzer.java}}, leftOp RowSchema is finally 
> {{int}} which is wrong: 
> {code}
> (_col0: int|{t01-subquery1}diff_long_type,_col1: 
> int|{t01-subquery1}id2,_col2: bigint|{t01-subquery1}id3)}}
> {code}
> Impacted code  in SemanticAnalyzer.java:
> {code}
>   if(!(leftOp instanceof UnionOperator)) {
> Operator oldChild = leftOp;
> leftOp = (Operator) leftOp.getParentOperators().get(0);
> leftOp.removeChildAndAdoptItsChildren(oldChild);
>   }
>   // make left a child of right
>   List child =
>   new ArrayList();
>   child.add(leftOp);
>   rightOp.setChildOperators(child);
>   List parent = leftOp
>   .getParentOperators();
>   parent.add(rightOp);
>   UnionDesc uDesc = ((UnionOperator) leftOp).getConf();
>   // Here we should set RowSchema of leftOp to unionoutRR's, or else the 
> RowSchema of leftOp is wrong.
>   // leftOp.setSchema(new RowSchema(unionoutRR.getColumnInfos()));
>   uDesc.setNumInputs(uDesc.getNumInputs() + 1);
>   return putOpInsertMap(leftOp, unionoutRR);
> {code}
> Steps to reproduce:
> {code}
> create table test_union_different_type(id bigint, id2 bigint, id3 bigint, 
> name string);
> set hive.auto.convert.join=true;
> insert overwrite table test_union_different_type select 1, 2, 3, 
> "test_union_different_type";
> select
>   t01.diff_long_type as diff_long_type,
>   t01.id2 as id2,
>   t00.id as id,
>   t01.id3 as id3
> from test_union_different_type t00
> left join
>   (
> select 1 as diff_long_type, 30 as id2, id3 from test_union_different_type
> union ALL
> select 2 as diff_long_type, 20 as id2, id3 from test_union_different_type
> union ALL
> select id as diff_long_type, id2, 30 as id3 from test_union_different_type
>   ) t01
> on t00.id = t01.diff_long_type
> ;
> {code}
> Stack trace:
> {code}
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"id":1,"id2":null,"id3":null,"name":null}
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"id":1,"id2":null,"id3":null,"name":null}
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:499)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : org.apache.hadoop.io.LongWritable cannot be 
> cast to org.apache.hadoop.io.IntWritable
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:465)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
>   at 
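
The type inference expected in the description above (int, int, bigint should 
resolve to bigint for the whole union) can be sketched as a fold over the 
branch types. The enum and method names are hypothetical and are not Hive 
APIs; only the two integral types from the report are modeled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; the enum and method names are not Hive APIs.
class UnionTypeSketch {

    // Integral types ordered from narrowest to widest.
    enum IntegralType { INT, BIGINT }

    // The common type of two branches is the wider of the two.
    static IntegralType widen(IntegralType a, IntegralType b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }

    // Fold over all branches, so a later, wider branch also updates the type
    // that was recorded for the earlier branches.
    static IntegralType unionType(List<IntegralType> branches) {
        IntegralType result = branches.get(0);
        for (IntegralType type : branches) {
            result = widen(result, type);
        }
        return result;
    }

    public static void main(String[] args) {
        // t1(int) union all t2(int) union all t3(bigint)
        System.out.println(unionType(Arrays.asList(
            IntegralType.INT, IntegralType.INT, IntegralType.BIGINT))); // BIGINT
    }
}
```

In the reported case, the schema recorded after the first two branches stays 
int even though the third branch widens the result, which is consistent with 
the LongWritable to IntWritable cast failure in the stack trace.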

[jira] [Commented] (HIVE-18144) Runtime type inference error when join three table for different column type

2017-11-30 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273968#comment-16273968
 ] 

Pengcheng Xiong commented on HIVE-18144:


LGTM. I saw you put 2.1 and 2.2 as affected versions. How about master?

>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
>   at 
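
The type-widening behaviour expected in the HIVE-18144 description above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions, not Hive's actual type-resolution code: the class UnionTypeSketch, the enum ColType and the methods wider and unionType are hypothetical names, and only three numeric types are modelled. It folds the branch types left to right, the way a chained UNION ALL is analyzed, so the intermediate result for t1 and t2 is INT and the final result after t3 is BIGINT.

```java
import java.util.Arrays;
import java.util.List;

public class UnionTypeSketch {
    // Hypothetical, simplified numeric types, declared from narrowest to widest.
    enum ColType { INT, BIGINT, DOUBLE }

    // Returns the wider of two types, i.e. the one declared later in the enum.
    static ColType wider(ColType a, ColType b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    // Folds the branch types left to right; the running result plays the
    // role of the leftOp schema, which must be widened at every step.
    static ColType unionType(List<ColType> branches) {
        ColType result = branches.get(0);
        for (ColType t : branches.subList(1, branches.size())) {
            result = wider(result, t);
        }
        return result;
    }

    public static void main(String[] args) {
        // t1 (int) union all t2 (int) union all t3 (bigint)
        List<ColType> branches =
            Arrays.asList(ColType.INT, ColType.INT, ColType.BIGINT);
        System.out.println(unionType(branches));
    }
}
```

The point of the sketch is that the running result has to be stored back after each step. The issue reports that the leftOp RowSchema kept the type from the first step (int) while rows of the wider type (bigint) flowed through it, which matches the LongWritable to IntWritable cast failure in the stack trace.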

[jira] [Updated] (HIVE-17185) TestHiveMetaStoreStatsMerge.testStatsMerge is failing

2017-07-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17185:
---
Fix Version/s: 3.0.0

> TestHiveMetaStoreStatsMerge.testStatsMerge is failing
> -
>
> Key: HIVE-17185
> URL: https://issues.apache.org/jira/browse/HIVE-17185
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Test
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
>
> Likely because of HIVE-16997



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17185) TestHiveMetaStoreStatsMerge.testStatsMerge is failing

2017-07-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-17185.

Resolution: Fixed

updated tests.

> TestHiveMetaStoreStatsMerge.testStatsMerge is failing
> -
>
> Key: HIVE-17185
> URL: https://issues.apache.org/jira/browse/HIVE-17185
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Test
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
>
> Likely because of HIVE-16997





[jira] [Assigned] (HIVE-17185) TestHiveMetaStoreStatsMerge.testStatsMerge is failing

2017-07-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-17185:
--

Assignee: Pengcheng Xiong

> TestHiveMetaStoreStatsMerge.testStatsMerge is failing
> -
>
> Key: HIVE-17185
> URL: https://issues.apache.org/jira/browse/HIVE-17185
> Project: Hive
>  Issue Type: Test
>  Components: Metastore, Test
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
>
> Likely because of HIVE-16997





[jira] [Updated] (HIVE-17185) TestHiveMetaStoreStatsMerge.testStatsMerge is failing

2017-07-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17185:
---
Issue Type: Sub-task  (was: Test)
Parent: HIVE-16995

> TestHiveMetaStoreStatsMerge.testStatsMerge is failing
> -
>
> Key: HIVE-17185
> URL: https://issues.apache.org/jira/browse/HIVE-17185
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Test
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
>
> Likely because of HIVE-16997





[jira] [Commented] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables

2017-07-29 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106314#comment-16106314
 ] 

Pengcheng Xiong commented on HIVE-17190:


Sounds like we did not have bitvectors for perfclidriver when we loaded tables...

> Schema changes for bitvectors for unpartitioned tables
> --
>
> Key: HIVE-17190
> URL: https://issues.apache.org/jira/browse/HIVE-17190
> Project: Hive
>  Issue Type: Test
>  Components: Metastore, Statistics
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-17190.2.patch
>
>
> Missed in HIVE-16997





[jira] [Assigned] (HIVE-17175) Improve desc formatted for bitvectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-17175:
--


> Improve desc formatted for bitvectors
> -
>
> Key: HIVE-17175
> URL: https://issues.apache.org/jira/browse/HIVE-17175
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Updated] (HIVE-16997) Extend object store to store and use bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Extend object store to store and use bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store and use bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Affects Version/s: 3.0.0

> Extend object store to store and use bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store and use bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Fix Version/s: 3.0.0

> Extend object store to store and use bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Commented] (HIVE-16997) Extend object store to store and use bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100904#comment-16100904
 ] 

Pengcheng Xiong commented on HIVE-16997:


I ran the failed tests on my local machine and updated them accordingly. Note 
that we do not use bitvectors for Spark, as we did not change the xml file. I 
pushed the patch to master. Thanks [~ashutoshc] for the review. Ccing 
[~hagleitn]: this should enable partition stats merging. Thanks.

> Extend object store to store and use bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store and use bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Summary: Extend object store to store and use bit vectors  (was: Extend 
object store to store bit vectors)

> Extend object store to store and use bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.06.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: (was: HIVE-16997.06.patch)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Open  (was: Patch Available)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.06.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Open  (was: Patch Available)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Open  (was: Patch Available)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.05.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16640) The ASF Headers have some errors in some class

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16640:
---
Fix Version/s: 2.3.0

> The ASF Headers have some errors in some class
> --
>
> Key: HIVE-16640
> URL: https://issues.apache.org/jira/browse/HIVE-16640
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16640.1.patch
>
>
> I found that some classes have the license header placed in an incorrect 
> location, and some classes are missing the license header.





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Open  (was: Patch Available)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.04.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Commented] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097514#comment-16097514
 ] 

Pengcheng Xiong commented on HIVE-16997:


[~ashutoshc], it seems that it is good to go. One thing that I haven't addressed 
is the blob type. It involves changes to thrift, MTableColumnStats, 
GenericUDAFComputeStats, and the SerDes of HLL and FM. Thanks.

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Open  (was: Patch Available)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.03.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: (was: HIVE-16997.03.patch)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Open  (was: Patch Available)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.03.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-17137) Fix javolution conflict

2017-07-21 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17137:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Fix javolution conflict
> ---
>
> Key: HIVE-17137
> URL: https://issues.apache.org/jira/browse/HIVE-17137
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: HIVE-17137.01.patch
>
>
> as reported by [~jcamachorodriguez]
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: javolution:javolution:jar -> duplicate declaration of version 
> ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], 
> /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, 
> column 17
> {code}
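
The warning above is Maven reporting that the same `groupId:artifactId:type:classifier` key is declared twice in one `<dependencies>` section. A rough way to spot such duplicates before building is sketched below in Python. It assumes the POM uses the standard Maven 4.0.0 namespace; the helper name `find_duplicate_dependencies` and the sample POM are illustrative and are not taken from the patch.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard POM namespace; assumed here, adjust if the file declares another one.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def find_duplicate_dependencies(pom_text):
    """Return dependency keys declared more than once in <project>/<dependencies>."""
    root = ET.fromstring(pom_text)
    keys = []
    for dep in root.findall("m:dependencies/m:dependency", NS):
        # Maven identifies a dependency by groupId:artifactId:type:classifier.
        group = dep.findtext("m:groupId", default="", namespaces=NS)
        artifact = dep.findtext("m:artifactId", default="", namespaces=NS)
        dep_type = dep.findtext("m:type", default="jar", namespaces=NS)
        classifier = dep.findtext("m:classifier", default="", namespaces=NS)
        keys.append((group, artifact, dep_type, classifier))
    return sorted(key for key, count in Counter(keys).items() if count > 1)


# Made-up POM fragment that reproduces the duplicate declaration.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>javolution</groupId>
      <artifactId>javolution</artifactId>
      <version>${javolution.version}</version>
    </dependency>
    <dependency>
      <groupId>commons-lang</groupId>
      <artifactId>commons-lang</artifactId>
      <version>2.6</version>
    </dependency>
    <dependency>
      <groupId>javolution</groupId>
      <artifactId>javolution</artifactId>
      <version>${javolution.version}</version>
    </dependency>
  </dependencies>
</project>"""

duplicates = find_duplicate_dependencies(SAMPLE_POM)
print(duplicates)
```

The sketch only looks at the top-level `<dependencies>` section, which is where the warning in this issue points; removing one of the two declarations is the usual remedy.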





[jira] [Updated] (HIVE-17137) Fix javolution conflict

2017-07-21 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17137:
---
Fix Version/s: 3.0.0

> Fix javolution conflict
> ---
>
> Key: HIVE-17137
> URL: https://issues.apache.org/jira/browse/HIVE-17137
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: HIVE-17137.01.patch
>
>
> as reported by [~jcamachorodriguez]
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: javolution:javolution:jar -> duplicate declaration of version 
> ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], 
> /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, 
> column 17
> {code}





[jira] [Updated] (HIVE-17137) Fix javolution conflict

2017-07-21 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17137:
---
Affects Version/s: 3.0.0

> Fix javolution conflict
> ---
>
> Key: HIVE-17137
> URL: https://issues.apache.org/jira/browse/HIVE-17137
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: HIVE-17137.01.patch
>
>
> as reported by [~jcamachorodriguez]
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: javolution:javolution:jar -> duplicate declaration of version 
> ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], 
> /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, 
> column 17
> {code}





[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13125:
---
Attachment: ColumnMaskingInsertDesign.docx

> Support masking and filtering of rows/columns
> -
>
> Key: HIVE-13125
> URL: https://issues.apache.org/jira/browse/HIVE-13125
> Project: Hive
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: ColumnMaskingInsertDesign.docx, HIVE-13125.01.patch, 
> HIVE-13125.02.patch, HIVE-13125.03.patch, HIVE-13125.04.patch, 
> HIVE-13125.final.patch
>
>
> Traditionally, access control at the row and column level is implemented 
> through views. Using views as an access control method works well only when 
> access rules, restrictions, and conditions are monolithic and simple. It 
> however becomes ineffective when view definitions become too complex because 
> of the complexity and granularity of privacy and security policies. It also 
> becomes costly when a large number of views must be manually updated and 
> maintained. In addition, the ability to update views proves to be 
> challenging. As privacy and security policies evolve, required updates to 
> views may negatively affect the security logic particularly when database 
> applications reference the views directly by name. HIVE row and column access 
> control helps resolve all these problems.





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Open  (was: Patch Available)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.02.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-17137) Fix javolution conflict

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17137:
---
Status: Patch Available  (was: Open)

> Fix javolution conflict
> ---
>
> Key: HIVE-17137
> URL: https://issues.apache.org/jira/browse/HIVE-17137
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-17137.01.patch
>
>
> as reported by [~jcamachorodriguez]
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: javolution:javolution:jar -> duplicate declaration of version 
> ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], 
> /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, 
> column 17
> {code}





[jira] [Updated] (HIVE-17137) Fix javolution conflict

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17137:
---
Attachment: HIVE-17137.01.patch

> Fix javolution conflict
> ---
>
> Key: HIVE-17137
> URL: https://issues.apache.org/jira/browse/HIVE-17137
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-17137.01.patch
>
>
> as reported by [~jcamachorodriguez]
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: javolution:javolution:jar -> duplicate declaration of version 
> ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], 
> /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, 
> column 17
> {code}





[jira] [Commented] (HIVE-17137) Fix javolution conflict

2017-07-20 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095065#comment-16095065
 ] 

Pengcheng Xiong commented on HIVE-17137:


[~jcamachorodriguez], could you review the patch? Thanks!

> Fix javolution conflict
> ---
>
> Key: HIVE-17137
> URL: https://issues.apache.org/jira/browse/HIVE-17137
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-17137.01.patch
>
>
> as reported by [~jcamachorodriguez]
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: javolution:javolution:jar -> duplicate declaration of version 
> ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], 
> /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, 
> column 17
> {code}





[jira] [Assigned] (HIVE-17137) Fix javolution conflict

2017-07-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-17137:
--


> Fix javolution conflict
> ---
>
> Key: HIVE-17137
> URL: https://issues.apache.org/jira/browse/HIVE-17137
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> as reported by [~jcamachorodriguez]
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: javolution:javolution:jar -> duplicate declaration of version 
> ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], 
> /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, 
> column 17
> {code}





[jira] [Commented] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-20 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095044#comment-16095044
 ] 

Pengcheng Xiong commented on HIVE-16996:


Yes, I also saw that just now. I think it is a problem on my side. I will take a 
look and put up a patch. Thanks for discovering this!

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch, HIVE-16966.07.patch
>
>






[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Description: This patch includes: (1) a new serde for FMSketch (2) change 
of schema for derby and mysql (3) support for date type (4) refactoring the 
extrapolation and merge code

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code





[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Status: Patch Available  (was: Open)

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch
>
>






[jira] [Updated] (HIVE-16997) Extend object store to store bit vectors

2017-07-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16997:
---
Attachment: HIVE-16997.01.patch

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16997.01.patch
>
>






[jira] [Comment Edited] (HIVE-17119) Improve error message thrown by subqueries if there are no statistics

2017-07-18 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092235#comment-16092235
 ] 

Pengcheng Xiong edited comment on HIVE-17119 at 7/18/17 9:42 PM:
-

Thanks [~vgarg]. I think [~rusanu] previously had a patch that pops up 
warning messages for missing stats. But that is not enough.


was (Author: pxiong):
Thanks [~vgarg]. I think [~rusanu] previously had a patch that pops up 
warning messages for missing stats.

> Improve error message thrown by subqueries if there are no statistics
> -
>
> Key: HIVE-17119
> URL: https://issues.apache.org/jira/browse/HIVE-17119
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> Currently, if there are no stats, Hive bails out of CBO and tries the non-CBO 
> route. Since most subqueries are supported only in CBO and not in non-CBO, we 
> end up with a completely different error message. This creates confusion.
> One possible solution is to emit warnings that stats are missing before 
> throwing the subquery-related error message.
> cc [~pxiong]





[jira] [Commented] (HIVE-17119) Improve error message thrown by subqueries if there are no statistics

2017-07-18 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092235#comment-16092235
 ] 

Pengcheng Xiong commented on HIVE-17119:


Thanks [~vgarg]. I think [~rusanu] previously had a patch that pops up 
warning messages for missing stats.

> Improve error message thrown by subqueries if there are no statistics
> -
>
> Key: HIVE-17119
> URL: https://issues.apache.org/jira/browse/HIVE-17119
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> Currently, if there are no stats, Hive bails out of CBO and tries the non-CBO 
> route. Since most subqueries are supported only in CBO and not in non-CBO, we 
> end up with a completely different error message. This creates confusion.
> One possible solution is to emit warnings that stats are missing before 
> throwing the subquery-related error message.
> cc [~pxiong]





[jira] [Commented] (HIVE-16997) Extend object store to store bit vectors

2017-07-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088762#comment-16088762
 ] 

Pengcheng Xiong commented on HIVE-16997:


After converting each bit vector in the FM sketch to 4 bytes, we need 
1024*4 = 4096 bytes for the FM sketch.
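
The arithmetic in this comment can be checked with a small sketch. It assumes each FM sketch bit vector is 32 bits wide and is packed into one 4-byte integer; `pack_bit_vector` and the constant names are illustrative and are not Hive's API.

```python
import struct

NUM_BIT_VECTORS = 1024   # maximum number of bit vectors mentioned in the thread
BYTES_PER_VECTOR = 4     # one 32-bit integer per bit vector (assumption)


def pack_bit_vector(set_bits):
    """Pack a collection of bit positions (0..31) into 4 big-endian bytes."""
    value = 0
    for bit in set_bits:
        if not 0 <= bit < 32:
            raise ValueError("bit position out of range: %d" % bit)
        value |= 1 << bit
    return struct.pack(">I", value)


packed = pack_bit_vector({0, 1, 2, 31})
total_bytes = NUM_BIT_VECTORS * BYTES_PER_VECTOR
print(len(packed), total_bytes)
```

With 1024 vectors at 4 bytes each, the packed sketch works out to 4096 bytes.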

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Commented] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088685#comment-16088685
 ] 

Pengcheng Xiong commented on HIVE-16996:


Updated the related q files and pushed to master. Thanks [~ashutoshc] and 
[~prasanth_j] for the review.

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch, HIVE-16966.07.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Fix Version/s: 3.0.0

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch, HIVE-16966.07.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 3.0.0
>
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch, HIVE-16966.07.patch
>
>






[jira] [Commented] (HIVE-15758) Allow correlated scalar subqueries with aggregates which has non-equi join predicates

2017-07-14 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088127#comment-16088127
 ] 

Pengcheng Xiong commented on HIVE-15758:


Thanks [~vgarg] for the reply. However, I am still not convinced that the 
rewrite can work without matching the row_id. Let's wait for the QA results. 
By the way, could you try the example that I gave in the previous comments? Thanks.

> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -
>
> Key: HIVE-15758
> URL: https://issues.apache.org/jira/browse/HIVE-15758
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15758.1.patch, HIVE-15758.2.patch
>
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since Hive 
> doesn't know how to rewrite such queries while preserving correctness for 
> cases where the subquery matches zero rows.
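
To see why the zero-row case is the hard part, here is a small simulation in Python that uses made-up rows in place of the `part` table. For an outer row with no matching inner rows, `count` must evaluate to 0, but a plain inner join followed by a group-by produces no group for that row at all, so a naive rewrite can silently drop it. The function names and data are hypothetical, and neither function is meant to represent Hive's actual rewrite.

```python
# Hypothetical stand-in for the part table: (p_type, p_size) rows.
PART = [("STEEL", 0), ("STEEL", 5)]


def reference_semantics(rows):
    """Evaluate the correlated subquery once per outer row, as SQL defines it."""
    result = []
    for p_type, p_size in rows:
        # count(p_size) over the matching inner rows; 0 when nothing matches.
        cnt = sum(1 for pp_type, pp_size in rows
                  if p_type != pp_type and pp_size is not None)
        if p_size != cnt:
            result.append((p_type, p_size))
    return result


def naive_join_rewrite(rows):
    """Inner join on the non-equi predicate, then group by the outer row."""
    counts = {}
    for outer_id, (p_type, _) in enumerate(rows):
        for pp_type, pp_size in rows:
            if p_type != pp_type and pp_size is not None:
                counts[outer_id] = counts.get(outer_id, 0) + 1
    # Outer rows without any join partner never get a group, so they vanish.
    return [rows[i] for i, cnt in sorted(counts.items()) if rows[i][1] != cnt]


expected = reference_semantics(PART)
naive = naive_join_rewrite(PART)
print(expected, naive)
```

Because every row here has the same `p_type`, each outer row matches zero inner rows. The reference evaluation keeps the row whose `p_size` differs from 0, while the naive join loses both rows.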





[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Patch Available  (was: Open)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch, HIVE-16966.07.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Open  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch, HIVE-16966.07.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Attachment: HIVE-16966.07.patch

Use TreeMap and add test cases as per the reviewers' comments.

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch, HIVE-16966.07.patch
>
>






[jira] [Resolved] (HIVE-17096) Fix test failures in 2.3 branch

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-17096.

Resolution: Fixed

> Fix test failures in 2.3 branch
> ---
>
> Key: HIVE-17096
> URL: https://issues.apache.org/jira/browse/HIVE-17096
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
>






[jira] [Commented] (HIVE-17096) Fix test failures in 2.3 branch

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086872#comment-16086872
 ] 

Pengcheng Xiong commented on HIVE-17096:


Updated the q files.

> Fix test failures in 2.3 branch
> ---
>
> Key: HIVE-17096
> URL: https://issues.apache.org/jira/browse/HIVE-17096
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
>






[jira] [Assigned] (HIVE-17096) Fix test failures in 2.3 branch

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-17096:
--


> Fix test failures in 2.3 branch
> ---
>
> Key: HIVE-17096
> URL: https://issues.apache.org/jira/browse/HIVE-17096
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Commented] (HIVE-16997) Extend object store to store bit vectors

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086695#comment-16086695
 ] 

Pengcheng Xiong commented on HIVE-16997:


HLL dense register: 2^p bytes for the registers, with default p=14, that is, 
16384 bytes.
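
As a quick check of the figure above, assuming one byte per register as the comment states (the function name is illustrative and is not part of Hive):

```python
def hll_dense_register_bytes(p):
    """Size in bytes of a dense HLL register array with 2^p one-byte registers."""
    return 1 << p


default_size = hll_dense_register_bytes(14)
print(default_size)
```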

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Commented] (HIVE-16997) Extend object store to store bit vectors

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086653#comment-16086653
 ] 

Pengcheng Xiong commented on HIVE-16997:


FM Sketch: the maximum number of bit vectors is 1024 (1k). Each bit vector looks 
like {0,1,2,31}, and its length can reach 87. Thus, we need 87k in the worst 
case for FM Sketch.
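
One way to arrive at 87: if a bit vector is written out as text listing its bit positions, the longest case is the one where all 32 positions, 0 through 31, are present. The sketch below assumes that brace-and-comma text format, which is inferred from the example in the comment and not confirmed against Hive's serializer; `bit_vector_text` is an illustrative name.

```python
def bit_vector_text(set_bits):
    """Render bit positions as text such as {0,1,2,31}."""
    return "{" + ",".join(str(bit) for bit in sorted(set_bits)) + "}"


worst_case = bit_vector_text(range(32))
worst_case_len = len(worst_case)
total_chars = 1024 * worst_case_len
print(worst_case_len, total_chars)
```

Under that assumption, 1024 bit vectors at 87 characters each come to 89088 characters, which matches the 87k worst case quoted in the comment.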

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086647#comment-16086647
 ] 

Pengcheng Xiong commented on HIVE-16907:


[~libing], the following JIRA may be helpful:
{code}
commit c23841e553cbd4f32d33842d49f9b9e52803d143
Author: Pengcheng Xiong 
Date:   Sun Oct 4 12:45:21 2015 -0700

HIVE-11699: Support special characters in quoted table names (Pengcheng 
Xiong, reviewed by John Pullokkaran)
{code}

>  "INSERT INTO"  overwrite old data when destination table encapsulated by 
> backquote 
> 
>
> Key: HIVE-16907
> URL: https://issues.apache.org/jira/browse/HIVE-16907
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0, 2.1.1
>Reporter: Nemon Lou
>Assignee: Bing Li
> Attachments: HIVE-16907.1.patch
>
>
> A way to reproduce:
> {noformat}
> create database tdb;
> use tdb;
> create table t1(id int);
> create table t2(id int);
> explain insert into `tdb.t1` select * from t2;
> {noformat}
> {noformat}
> +---+
> | Explain |
> +---+
> | STAGE DEPENDENCIES: |
> |   Stage-1 is a root stage |
> |   Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, Stage-4 |
> |   Stage-3 |
> |   Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 |
> |   Stage-2 |
> |   Stage-4 |
> |   Stage-5 depends on stages: Stage-4 |
> | |
> | STAGE PLANS: |
> |   Stage: Stage-1 |
> | Map Reduce |
> |   Map Operator Tree: |
> |   TableScan |
> | alias: t2 |
> | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> | Select Operator |
> |   expressions: id (type: int) |
> |   outputColumnNames: _col0 |
> |   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> |   File Output Operator |
> | 

[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Open  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Patch Available  (was: Open)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Attachment: HIVE-16966.06.patch

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch
>
>






[jira] [Commented] (HIVE-15758) Allow correlated scalar subqueries with aggregates which has non-equi join predicates

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086622#comment-16086622
 ] 

Pengcheng Xiong commented on HIVE-15758:


[~vgarg], could you briefly explain how Hive matches rows between the inner and 
outer queries? Thanks.

> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -
>
> Key: HIVE-15758
> URL: https://issues.apache.org/jira/browse/HIVE-15758
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15758.1.patch
>
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since Hive 
> doesn't know how to rewrite such queries while preserving correctness for 
> cases where there are zero rows





[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Open  (was: Patch Available)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-17071) Make hive 2.3 depend on storage-api-2.4

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17071:
---
Summary: Make hive 2.3 depend on storage-api-2.4  (was: Make hive 2.3 
depend on storage-api-2.3)

> Make hive 2.3 depend on storage-api-2.4
> ---
>
> Key: HIVE-17071
> URL: https://issues.apache.org/jira/browse/HIVE-17071
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
> Attachments: HIVE-17071-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-17071) Make hive 2.3 depend on storage-api-2.4

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17071:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Make hive 2.3 depend on storage-api-2.4
> ---
>
> Key: HIVE-17071
> URL: https://issues.apache.org/jira/browse/HIVE-17071
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
> Attachments: HIVE-17071-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Patch Available  (was: Open)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Open  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Patch Available  (was: Open)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Attachment: HIVE-16966.05.patch

rebase to master

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Patch Available  (was: Open)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Attachment: (was: HIVE-16966.04.patch)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Open  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Attachment: HIVE-16966.04.patch

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch
>
>






[jira] [Comment Edited] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-07-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084614#comment-16084614
 ] 

Pengcheng Xiong edited comment on HIVE-16907 at 7/12/17 8:18 PM:
-

That is exactly what I am worried about: Hive may not properly support table 
names containing ".". Could you estimate the work needed if we want to support 
this? Thanks.


was (Author: pxiong):
That is exactly what I am worried about: Hive may not properly support table 
names containing ".". Could you evaluate the work needed if we want to support 
this? Thanks.

>  "INSERT INTO"  overwrite old data when destination table encapsulated by 
> backquote 
> 
>
> Key: HIVE-16907
> URL: https://issues.apache.org/jira/browse/HIVE-16907
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0, 2.1.1
>Reporter: Nemon Lou
>Assignee: Bing Li
> Attachments: HIVE-16907.1.patch
>
>
> A way to reproduce:
> {noformat}
> create database tdb;
> use tdb;
> create table t1(id int);
> create table t2(id int);
> explain insert into `tdb.t1` select * from t2;
> {noformat}
> {noformat}
> +---+
> | Explain |
> +---+
> | STAGE DEPENDENCIES: |
> |   Stage-1 is a root stage |
> |   Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, Stage-4 |
> |   Stage-3 |
> |   Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 |
> |   Stage-2 |
> |   Stage-4 |
> |   Stage-5 depends on stages: Stage-4 |
> | |
> | STAGE PLANS: |
> |   Stage: Stage-1 |
> | Map Reduce |
> |   Map Operator Tree: |
> |   TableScan |
> | alias: t2 |
> | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> | Select Operator |
> |   expressions: id (type: int) |
> |   outputColumnNames: _col0 |
> |   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> |   File Output Operator

[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-07-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084614#comment-16084614
 ] 

Pengcheng Xiong commented on HIVE-16907:


That is exactly what I am worried about: Hive may not properly support table 
names containing ".". Could you evaluate the work needed if we want to support 
this? Thanks.

>  "INSERT INTO"  overwrite old data when destination table encapsulated by 
> backquote 
> 
>
> Key: HIVE-16907
> URL: https://issues.apache.org/jira/browse/HIVE-16907
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0, 2.1.1
>Reporter: Nemon Lou
>Assignee: Bing Li
> Attachments: HIVE-16907.1.patch
>
>
> A way to reproduce:
> {noformat}
> create database tdb;
> use tdb;
> create table t1(id int);
> create table t2(id int);
> explain insert into `tdb.t1` select * from t2;
> {noformat}
> {noformat}
> +---+
> | Explain |
> +---+
> | STAGE DEPENDENCIES: |
> |   Stage-1 is a root stage |
> |   Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, Stage-4 |
> |   Stage-3 |
> |   Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 |
> |   Stage-2 |
> |   Stage-4 |
> |   Stage-5 depends on stages: Stage-4 |
> | |
> | STAGE PLANS: |
> |   Stage: Stage-1 |
> | Map Reduce |
> |   Map Operator Tree: |
> |   TableScan |
> | alias: t2 |
> | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> | Select Operator |
> |   expressions: id (type: int) |
> |   outputColumnNames: _col0 |
> |   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> |   File Output Operator |
> | compressed: false |

[jira] [Updated] (HIVE-15144) JSON.org license is now CatX

2017-07-11 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15144:
---
Fix Version/s: 3.0.0
   2.3.0

> JSON.org license is now CatX
> 
>
> Key: HIVE-15144
> URL: https://issues.apache.org/jira/browse/HIVE-15144
> Project: Hive
>  Issue Type: Bug
>Reporter: Robert Kanter
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 2.2.0, 2.3.0, 3.0.0
>
> Attachments: HIVE-15144.patch, HIVE-15144.patch, HIVE-15144.patch, 
> HIVE-15144.patch
>
>
> per [update resolved legal|http://www.apache.org/legal/resolved.html#json]:
> {quote}
> CAN APACHE PRODUCTS INCLUDE WORKS LICENSED UNDER THE JSON LICENSE?
> No. As of 2016-11-03 this has been moved to the 'Category X' license list. 
> Prior to this, use of the JSON Java library was allowed. See Debian's page 
> for a list of alternatives.
> {quote}
> I'm not sure when this dependency was first introduced, but it looks like 
> it's currently used in a few places:
> https://github.com/apache/hive/search?p=1&q=%22org.json%22&utf8=%E2%9C%93





[jira] [Updated] (HIVE-15144) JSON.org license is now CatX

2017-07-11 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15144:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> JSON.org license is now CatX
> 
>
> Key: HIVE-15144
> URL: https://issues.apache.org/jira/browse/HIVE-15144
> Project: Hive
>  Issue Type: Bug
>Reporter: Robert Kanter
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 2.2.0, 2.3.0, 3.0.0
>
> Attachments: HIVE-15144.patch, HIVE-15144.patch, HIVE-15144.patch, 
> HIVE-15144.patch
>
>
> per [update resolved legal|http://www.apache.org/legal/resolved.html#json]:
> {quote}
> CAN APACHE PRODUCTS INCLUDE WORKS LICENSED UNDER THE JSON LICENSE?
> No. As of 2016-11-03 this has been moved to the 'Category X' license list. 
> Prior to this, use of the JSON Java library was allowed. See Debian's page 
> for a list of alternatives.
> {quote}
> I'm not sure when this dependency was first introduced, but it looks like 
> it's currently used in a few places:
> https://github.com/apache/hive/search?p=1&q=%22org.json%22&utf8=%E2%9C%93





[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-07-11 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082649#comment-16082649
 ] 

Pengcheng Xiong commented on HIVE-16907:


Thanks [~nemon] for discovering this, and thanks [~libing] for the patch. 
However, it seems to me that although Hive parses "`tdb.t1`" as a single table 
name in the AST, it treats the name as the unquoted tdb.t1 when it actually 
processes it. Can you check how other databases, e.g., Oracle, PostgreSQL, and 
MySQL, behave here? I suspect that current Hive has a bug when a table name 
contains a dot.
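
To make the distinction concrete, here is a minimal Python sketch of 
quote-aware name splitting. It is not Hive's parser: the function name 
split_qualified_name is hypothetical, and it assumes a convention in which a 
pair of backquotes delimits exactly one identifier. The last line shows the 
quote-unaware handling that would produce the behavior suspected above.

```python
def split_qualified_name(ref: str) -> list:
    """Split a table reference on dots that lie outside backquotes.

    Hypothetical helper for illustration only; not taken from Hive.
    """
    parts, current, quoted = [], [], False
    for ch in ref:
        if ch == "`":
            quoted = not quoted          # toggle quoting, drop the backquote
        elif ch == "." and not quoted:
            parts.append("".join(current))  # dot outside quotes ends a part
            current = []
        else:
            current.append(ch)
    if quoted:
        raise ValueError("unterminated backquote in " + ref)
    parts.append("".join(current))
    return parts


print(split_qualified_name("tdb.t1"))      # ['tdb', 't1']
print(split_qualified_name("`tdb`.`t1`"))  # ['tdb', 't1']
print(split_qualified_name("`tdb.t1`"))    # ['tdb.t1']
print("`tdb.t1`".strip("`").split("."))    # ['tdb', 't1'] (quote-unaware)
```

Under this assumed convention, only the third form names a table literally 
called "tdb.t1"; stripping the quotes first collapses it into database tdb, 
table t1.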

>  "INSERT INTO"  overwrite old data when destination table encapsulated by 
> backquote 
> 
>
> Key: HIVE-16907
> URL: https://issues.apache.org/jira/browse/HIVE-16907
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0, 2.1.1
>Reporter: Nemon Lou
>Assignee: Bing Li
> Attachments: HIVE-16907.1.patch
>
>
> A way to reproduce:
> {noformat}
> create database tdb;
> use tdb;
> create table t1(id int);
> create table t2(id int);
> explain insert into `tdb.t1` select * from t2;
> {noformat}
> {noformat}
> +---+
> | Explain |
> +---+
> | STAGE DEPENDENCIES: |
> |   Stage-1 is a root stage |
> |   Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, Stage-4 |
> |   Stage-3 |
> |   Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 |
> |   Stage-2 |
> |   Stage-4 |
> |   Stage-5 depends on stages: Stage-4 |
> | |
> | STAGE PLANS: |
> |   Stage: Stage-1 |
> | Map Reduce |
> |   Map Operator Tree: |
> |   TableScan |
> | alias: t2 |
> | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> | Select Operator |
> |   expressions: id (type: int) |
> |   outputColumnNames: _col0 |
> |   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> |   File Output Operator

[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-10 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Open  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-10 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Patch Available  (was: Open)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch
>
>






[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-10 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Attachment: HIVE-16966.04.patch

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch
>
>





