[jira] [Commented] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2016-05-16 Thread Bill Wailliam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15286059#comment-15286059
 ] 

Bill Wailliam commented on HIVE-13745:
--

2016-05-17 10:54:32,779 FATAL [main] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {"time":"2016-05-17 10:43:54.09","offset":100}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:545)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFUnixTimeStamp.initializeInput(GenericUDFUnixTimeStamp.java:50)
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initialize(GenericUDFToUnixTimeStamp.java:66)
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:139)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:145)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:139)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:139)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:139)
at 
org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:76)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:164)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:535)
... 9 more
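
When reading a wrapped trace like the one above, the innermost "Caused by:" line and the first frame beneath it are usually the place to start. The following is a minimal sketch in Python for pulling those two lines out of a trace pasted from this archive. The function name root_cause and the SAMPLE text are hypothetical helpers for illustration, not part of Hive or JIRA.

```python
import re

def root_cause(trace):
    """Return (exception, first_frame) for the innermost 'Caused by:' block, or None."""
    # Re-join frames that the archive wrapped as 'at' + newline + frame.
    joined = re.sub(r"(?m)^(\s*at)\s*\n\s*", r"\1 ", trace)
    lines = [ln.strip() for ln in joined.splitlines() if ln.strip()]
    # The last 'Caused by:' line in the trace is the innermost cause.
    cause_idx = max(
        (i for i, ln in enumerate(lines) if ln.startswith("Caused by:")),
        default=None,
    )
    if cause_idx is None:
        return None
    exception = lines[cause_idx][len("Caused by:"):].strip()
    # First 'at ...' frame after the cause is where the exception was thrown.
    frame = next(
        (ln[3:] for ln in lines[cause_idx + 1:] if ln.startswith("at ")),
        None,
    )
    return exception, frame

# Abbreviated copy of the trace in this message (illustrative input only).
SAMPLE = """org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:545)
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFUnixTimeStamp.initializeInput(GenericUDFUnixTimeStamp.java:50)
... 9 more
"""

print(root_cause(SAMPLE))
```

For the trace in this comment, this points at GenericUDFUnixTimeStamp.initializeInput (line 50) as the frame that threw the NullPointerException.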


> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Bill Wailliam
>Assignee: Bill Wailliam
> Attachments: HIVE-13745.patch
>
>
> NullPointerException when current_date is used in mapreduce



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13746) Data duplication when insert overwrite

2016-05-12 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-13746:
-
Description: Data duplication on insert overwrite. The old data cannot be 
deleted.

> Data duplication when insert overwrite 
> ---
>
> Key: HIVE-13746
> URL: https://issues.apache.org/jira/browse/HIVE-13746
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Bill Wailliam
>Priority: Critical
>
> Data duplication on insert overwrite. The old data cannot be deleted.





[jira] [Updated] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2016-05-12 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-13745:
-
Attachment: HIVE-13745.patch

> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Bill Wailliam
>Assignee: Bill Wailliam
> Attachments: HIVE-13745.patch
>
>
> NullPointerException when current_date is used in mapreduce





[jira] [Updated] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2016-05-12 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-13745:
-
Affects Version/s: 2.0.0
   Status: Patch Available  (was: Open)

> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Bill Wailliam
>Assignee: Bill Wailliam
>
> NullPointerException when current_date is used in mapreduce





[jira] [Updated] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2016-05-12 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-13745:
-
Description: When I use current_date in mapreduce

> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Reporter: Bill Wailliam
>Assignee: Bill Wailliam
>
> When I use current_date in mapreduce





[jira] [Updated] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2016-05-12 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-13745:
-
Description: NullPointerException when   (was: When I use current_date in 
mapreduce)

> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Reporter: Bill Wailliam
>Assignee: Bill Wailliam
>
> NullPointerException when 





[jira] [Updated] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2016-05-12 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-13745:
-
Description: NullPointerException when current_date is used in mapreduce  
(was: NullPointerException when )

> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Reporter: Bill Wailliam
>Assignee: Bill Wailliam
>
> NullPointerException when current_date is used in mapreduce





[jira] [Updated] (HIVE-6189) Support top level union all statements

2016-03-03 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-6189:

Description: 
I've always wondered why union all has to be in subqueries in hive.

After looking at it, problems are:

- Hive Parser:
  - Union happens at the wrong place: (insert ... select ... union all select 
...) is parsed as (insert select) union select.
  - There are many rewrite rules in the parser to force any query into a 
from-insert-select form, no doubt for historical reasons.
- Plan generation/semantic analysis assumes top level "TOK_QUERY" and not top 
level "TOK_UNION".

The rewrite rules don't work when we move the "UNION ALL" into the select 
statements. However, it's not hard to do that in code.

  was:
I've always wondered why union all has to be in subqueries in hive.

After looking at it, problems are:

- Hive Parser:
  - Union happens at the wrong place: (insert ... select ... union all select 
...) is parsed as (insert select) union select.
  - There are many rewrite rules in the parser to force any query into a 
from-insert-select form, no doubt for historical reasons.
- Plan generation/semantic analysis assumes top level "TOK_QUERY" and not top 
level "TOK_UNION".

The rewrite rules don't work when we move the "UNION ALL" recursion into the 
select statements. However, it's not hard to do that in code.


> Support top level union all statements
> --
>
> Key: HIVE-6189
> URL: https://issues.apache.org/jira/browse/HIVE-6189
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: 0.13.0
>
> Attachments: HIVE-6189.1.patch, HIVE-6189.2.patch, HIVE-6189.3.patch
>
>
> I've always wondered why union all has to be in subqueries in hive.
> After looking at it, problems are:
> - Hive Parser:
>   - Union happens at the wrong place: (insert ... select ... union all select 
> ...) is parsed as (insert select) union select.
>   - There are many rewrite rules in the parser to force any query into a 
> from-insert-select form, no doubt for historical reasons.
> - Plan generation/semantic analysis assumes top level "TOK_QUERY" and not top 
> level "TOK_UNION".
> The rewrite rules don't work when we move the "UNION ALL" into the select 
> statements. However, it's not hard to do that in code.
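
The grouping problem described above can be sketched with a toy tree model in Python. This is only an illustration under stated assumptions: the nested tuples are not Hive's real AST shape, the functions parsed_today and intended are hypothetical, and of the token names only TOK_QUERY and TOK_UNION come from the description (TOK_INSERT is used here purely as a label).

```python
def parsed_today(selects):
    """(insert select) union select: the insert binds only to the first select."""
    first, *rest = selects
    tree = ("TOK_QUERY", ("TOK_INSERT", first))
    for s in rest:
        # Each further select is unioned with the already-wrapped insert query,
        # so the union ends up as the top-level node.
        tree = ("TOK_UNION", tree, ("TOK_QUERY", s))
    return tree

def intended(selects):
    """insert (select union select): build the union first, then insert its result."""
    first, *rest = selects
    union = first
    for s in rest:
        union = ("TOK_UNION", union, s)
    return ("TOK_QUERY", ("TOK_INSERT", union))

selects = ["select a from t1", "select b from t2"]
print(parsed_today(selects)[0])  # top-level token under the current grouping
print(intended(selects)[0])      # top-level token under the intended grouping
```

In this model the current grouping puts TOK_UNION at the top of the tree, which is the case the description says plan generation does not expect, while the intended grouping keeps TOK_QUERY at the top.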





[jira] [Assigned] (HIVE-12067) ORC read nullString(eg: \N) columns, can't return NULL

2015-10-08 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam reassigned HIVE-12067:


Assignee: Bill Wailliam

> ORC read nullString(eg: \N) columns, can't return NULL
> --
>
> Key: HIVE-12067
> URL: https://issues.apache.org/jira/browse/HIVE-12067
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Bill Wailliam
>Assignee: Bill Wailliam
>
> Text format:
> SQL:create table test_input as select '\\N' from table;
> hive> select * from test_input limit 3;
> OK
> NULL
> NULL
> NULL
> ===
> ORC format:
> set hive.default.fileformat=Orc;
> create table test_orc_input as select '\\N' from table;
> hive> select * from test_orc_input limit 3;
> OK
> \N
> \N
> \N
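
The difference between the two outputs above can be modelled with a short Python sketch. It assumes the text table uses the default null marker \N of Hive's delimited text format, and it treats ORC simply as "typed storage that records nullness separately from the value". The function names text_roundtrip and typed_roundtrip are hypothetical and do not correspond to Hive or ORC APIs.

```python
NULL_SENTINEL = r"\N"  # assumed default null marker for delimited text tables

def text_roundtrip(value):
    """Delimited text: NULL is written as the sentinel string, and any field
    equal to the sentinel is read back as NULL."""
    on_disk = NULL_SENTINEL if value is None else value
    return None if on_disk == NULL_SENTINEL else on_disk

def typed_roundtrip(value):
    """ORC-like typed storage: nullness is kept in a separate flag, so a string
    that happens to equal the sentinel is returned unchanged."""
    is_present = value is not None
    payload = value
    return payload if is_present else None

literal = r"\N"  # the value produced by select '\\N' in the report
print(text_roundtrip(literal))   # read back as NULL (None) from text
print(typed_roundtrip(literal))  # read back as the two-character string
```

Under these assumptions the literal string is indistinguishable from NULL once written as text, while the typed format keeps it as data, which matches the two query results in the report.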





[jira] [Updated] (HIVE-12067) ORC read nullString(eg: \N) columns, can't return NULL

2015-10-08 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-12067:
-
Assignee: Ashutosh Chauhan

> ORC read nullString(eg: \N) columns, can't return NULL
> --
>
> Key: HIVE-12067
> URL: https://issues.apache.org/jira/browse/HIVE-12067
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Bill Wailliam
>Assignee: Ashutosh Chauhan
>
> Text format:
> SQL:create table test_input as select '\\N' from table;
> hive> select * from test_input limit 3;
> OK
> NULL
> NULL
> NULL
> ===
> ORC format:
> set hive.default.fileformat=Orc;
> create table test_orc_input as select '\\N' from table;
> hive> select * from test_orc_input limit 3;
> OK
> \N
> \N
> \N





[jira] [Commented] (HIVE-12067) ORC read nullString(eg: \N) columns, can't return NULL

2015-10-08 Thread Bill Wailliam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949734#comment-14949734
 ] 

Bill Wailliam commented on HIVE-12067:
--

Yes, it is better not to serialize special characters, but modifying the 
LazySimpleSerDe encoding/decoding may affect existing data.

> ORC read nullString(eg: \N) columns, can't return NULL
> --
>
> Key: HIVE-12067
> URL: https://issues.apache.org/jira/browse/HIVE-12067
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Bill Wailliam
>Assignee: Ashutosh Chauhan
>
> Text format:
> SQL:create table test_input as select '\\N' from table;
> hive> select * from test_input limit 3;
> OK
> NULL
> NULL
> NULL
> ===
> ORC format:
> set hive.default.fileformat=Orc;
> create table test_orc_input as select '\\N' from table;
> hive> select * from test_orc_input limit 3;
> OK
> \N
> \N
> \N





[jira] [Commented] (HIVE-11271) java.lang.IndexOutOfBoundsException when union all with if function

2015-09-14 Thread Bill Wailliam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14743211#comment-14743211
 ] 

Bill Wailliam commented on HIVE-11271:
--

The patch may not work. I applied the latest patches but still have the same 
problem.

> java.lang.IndexOutOfBoundsException when union all with if function
> ---
>
> Key: HIVE-11271
> URL: https://issues.apache.org/jira/browse/HIVE-11271
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11271.1.patch, HIVE-11271.2.patch, 
> HIVE-11271.3.patch, HIVE-11271.4.patch
>
>
> Some queries with Union all as subquery fail in MapReduce task with 
> stacktrace:
> {noformat}
> 15/07/15 14:19:30 [pool-13-thread-1]: INFO exec.UnionOperator: Initializing 
> operator UNION[104]
> 15/07/15 14:19:30 [Thread-72]: INFO mapred.LocalJobRunner: Map task executor 
> complete.
> 15/07/15 14:19:30 [Thread-72]: WARN mapred.LocalJobRunner: 
> job_local826862759_0005
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 10 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140)
>   ... 21 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at 
>