RE: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-09-01 Thread chutium
thanks a lot, Hao, finally solved this problem, changes of CSVSerDe are here: https://github.com/chutium/csv-serde/commit/22c667c003e705613c202355a8791978d790591e btw, "add jar" in spark hive or hive-thriftserver always doesn't work, we build the spark with libraryDependencies += "csv-serde" ...

RE: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-08-31 Thread Cheng, Hao
a get different dataTypes, feature or a bug? really strange and surprised... Hi Cheng, thank you very much for helping me to finally find out the secret of this magic... actually we defined this external table with SID STRING REQUEST_ID STRING TIMES_DQ TIMESTAMP TOTAL_PRICE

Re: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-08-31 Thread chutium
Hi Cheng, thank you very much for helping me to finally find out the secret of this magic... actually we defined this external table with SID STRING REQUEST_ID STRING TIMES_DQ TIMESTAMP TOTAL_PRICE FLOAT ... using "desc table ext_fullorders" it is only shown as [# col_name

Re: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-08-27 Thread Cheng Lian
I believe in your case, the “magic” happens in TableReader.fillObject . Here we unwrap the field value according to the object inspector of that f

Re: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-08-26 Thread chutium
oops, i tried on a managed table, column types will not be changed so it is mostly due to the serde lib CSVSerDe (https://github.com/ogrodnek/csv-serde/blob/master/src/main/java/com/bizo/hive/serde/csv/CSVSerde.java#L123) or maybe CSVReader from opencsv?... but if the columns are defined as strin