Re: Re: load data Failed with exception java.lang.IndexOutOfBoundsException

2016-09-09 Thread C R

drop table ods.loadtest;
create external table ods.loadtest
(
c1 string
)
stored as textfile
location '/tmp/loadtest';


hive> show create table ods.loadtest;
OK
CREATE EXTERNAL TABLE `ods.loadtest`(
  `c1` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://bidc/tmp/loadtest'
TBLPROPERTIES (
  'numFiles'='1',
  'totalSize'='4',
  'transient_lastDdlTime'='1473400143')


hive.default.fileformat
TextFile

  Expects one of [textfile, sequencefile, rcfile, orc].
  Default file format for CREATE TABLE statement. Users can explicitly 
override it by CREATE TABLE ... STORED AS [FORMAT]
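One way to confirm which default applies in the current session, and what the metastore actually recorded for the table, is to ask Hive directly. This is only a minimal sketch and assumes the same CLI session and the same table as above:

```sql
-- Print the effective default file format for this session.
SET hive.default.fileformat;

-- Show the storage details recorded for the table
-- (InputFormat, OutputFormat and SerDe library).
DESCRIBE FORMATTED ods.loadtest;
```

If the InputFormat shown is TextInputFormat, the table itself is not stored as ORC, whatever the session default says.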


> LOAD DATA LOCAL INPATH '1.dat' overwrite INTO TABLE ODS.loadtest;
Loading data to table ods.loadtest
Failed with exception java.lang.IndexOutOfBoundsException
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask


The load succeeds if the file has more than two characters, which is a little 
interesting. I cannot understand why checkInputFormat ends up in 
OrcInputFormat for a text table; maybe that is just how it works.
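The stack trace in this thread shows the exception being raised from HiveFileFormatUtils.checkInputFormat, called by MoveTask. If that load-time check is indeed the cause, a possible workaround (an assumption, not verified on this cluster) is to switch the check off for the session with hive.fileformat.check:

```sql
-- Assumption: the failure comes from the file format check that
-- MoveTask runs during LOAD DATA.
-- hive.fileformat.check defaults to true; setting it to false skips
-- that check for the current session only.
SET hive.fileformat.check=false;
LOAD DATA LOCAL INPATH '1.dat' OVERWRITE INTO TABLE ods.loadtest;

-- Restore the default so later loads are validated again.
SET hive.fileformat.check=true;
```

With the check disabled, Hive no longer verifies that the file matches the table's format, so this is only reasonable when the file is known to be plain text.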

Thanks.


From: Stephen Sprague<mailto:sprag...@gmail.com>
Date: 2016-09-09 12:47
To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: Re: Re: load data Failed with exception 
java.lang.IndexOutOfBoundsException
>at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.validateInput(OrcInputFormat.java:508)

would it be safe to assume that you are trying to load a text file into a 
table stored as ORC?

your create table doesn't specify that explicitly so that means you have a 
setting in your configs that says new tables are to be stored as ORC if not 
specified otherwise.

too bad there isn't an error message like: "loading text data into a 
non-TEXTFILE table generally isn't a good idea". :)
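For the case described here, a table that really is stored as ORC, the usual pattern is to load the text file into a TEXTFILE staging table and let Hive rewrite the rows into ORC. A minimal sketch; the table names ods.loadtest_stage and ods.loadtest_orc are hypothetical and do not appear in the original report:

```sql
-- Hypothetical staging table whose format matches the raw file.
CREATE TABLE ods.loadtest_stage (c1 STRING) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '1.dat' OVERWRITE INTO TABLE ods.loadtest_stage;

-- Hypothetical ORC table; INSERT ... SELECT converts the rows on write.
CREATE TABLE ods.loadtest_orc (c1 STRING) STORED AS ORC;
INSERT OVERWRITE TABLE ods.loadtest_orc
SELECT c1 FROM ods.loadtest_stage;
```

LOAD DATA only moves files into place, so it is the INSERT ... SELECT step that does the actual format conversion.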

then again maybe somebody knows something i don't.

Cheers,
Stephen.





On Thu, Sep 8, 2016 at 7:37 PM, C R 
<cuirong198...@hotmail.com<mailto:cuirong198...@hotmail.com>> wrote:

Yes, based on my testing, the load fails when the content of file 1.dat is 
any value from 0 to 99, whether the column type is string or int.


Re: Re: load data Failed with exception java.lang.IndexOutOfBoundsException

2016-09-08 Thread C R

Yes, based on my testing, the load fails when the content of file 1.dat is 
any value from 0 to 99, whether the column type is string or int.

hive.log:

2016-09-09T09:10:40,978 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
CliDriver (SessionState.java:printInfo(1029)) - Hive-on-MR is deprecated in 
Hive 2 and may not be available in the future versions. Consider using a 
different execution engine (i.e. tez, spark) or using Hive 1.X releases.
2016-09-09T09:11:17,433 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
conf.HiveConf (HiveConf.java:getLogIdVar(3177)) - Using the default value 
passed in for log id: d1e08abd-5f8b-4149-a679-00ba6b4f4ab9
2016-09-09T09:11:17,462 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
ql.Driver (Driver.java:compile(409)) - Compiling 
command(queryId=hadoop_20160909091117_2f9e8e3b-b2e8-4312-b473-535881c1d726): 
LOAD DATA LOCAL INPATH '1.dat' overwrite INTO TABLE ODS.loadtest
2016-09-09T09:11:18,016 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 0: get_table : 
db=ODS tbl=loadtest
2016-09-09T09:11:18,016 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hadoop   
ip=unknown-ip-addr  cmd=get_table : db=ODS tbl=loadtest
2016-09-09T09:11:18,162 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
ql.Driver (Driver.java:compile(479)) - Semantic Analysis Completed
2016-09-09T09:11:18,163 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
ql.Driver (Driver.java:getSchema(251)) - Returning Hive schema: 
Schema(fieldSchemas:null, properties:null)
2016-09-09T09:11:18,167 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
ql.Driver (Driver.java:compile(551)) - Completed compiling 
command(queryId=hadoop_20160909091117_2f9e8e3b-b2e8-4312-b473-535881c1d726); 
Time taken: 0.725 seconds
2016-09-09T09:11:18,167 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
ql.Driver (Driver.java:checkConcurrency(171)) - Concurrency mode is disabled, 
not creating a lock manager
2016-09-09T09:11:18,167 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
ql.Driver (Driver.java:execute(1493)) - Executing 
command(queryId=hadoop_20160909091117_2f9e8e3b-b2e8-4312-b473-535881c1d726): 
LOAD DATA LOCAL INPATH '1.dat' overwrite INTO TABLE ODS.loadtest
2016-09-09T09:11:18,172 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
ql.Driver (Driver.java:launchTask(1832)) - Starting task [Stage-0:MOVE] in 
serial mode
2016-09-09T09:11:18,172 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
exec.Task (SessionState.java:printInfo(1029)) - Loading data to table 
ods.loadtest from file:1.dat
2016-09-09T09:11:18,172 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 0: get_table : 
db=ods tbl=loadtest
2016-09-09T09:11:18,173 INFO  [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hadoop   
ip=unknown-ip-addr  cmd=get_table : db=ods tbl=loadtest
2016-09-09T09:11:18,320 ERROR [d1e08abd-5f8b-4149-a679-00ba6b4f4ab9 main]: 
exec.Task (SessionState.java:printError(1038)) - Failed with exception 
java.lang.IndexOutOfBoundsException
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IndexOutOfBoundsException
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.checkInputFormat(HiveFileFormatUtils.java:195)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.checkTextInputFormat(HiveFileFormatUtils.java:217)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.checkInputFormat(HiveFileFormatUtils.java:182)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:306)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:158)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1834)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1578)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1178)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1166)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:236)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:782)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:721)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:648)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at