I believe that in your case, the “magic” happens in TableReader.fillObject <https://github.com/apache/spark/blob/4fa2fda88fc7beebb579ba808e400113b512533b/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L706-L712>, where each field value is unwrapped according to that field’s object inspector. It seems that a FloatObjectInspector is somehow specified for the total_price field. I don’t think CSVSerde is responsible for this, since it sets all field object inspectors to javaStringObjectInspector (here <https://github.com/ogrodnek/csv-serde/blob/f315c1ae4b21a8288eb939e7c10f3b29c1a854ef/src/main/java/com/bizo/hive/serde/csv/CSVSerde.java#L59-L61>).
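To illustrate the point, here is a minimal, self-contained sketch of inspector-driven unwrapping. The `ObjectInspector`, `StringInspector`, and `FloatInspector` types below are simplified stand-ins I made up for this example, not the real org.apache.hadoop.hive.serde2 API — the idea is just that the inspector associated with a field, not the declared column type, decides the runtime type of the unwrapped value:

```java
// Simplified stand-ins for Hive's ObjectInspector hierarchy (hypothetical,
// for illustration only -- not the real org.apache.hadoop.hive.serde2 API).
interface ObjectInspector {
    Object unwrap(Object raw);
}

// Analogue of javaStringObjectInspector: yields the raw value as a String.
class StringInspector implements ObjectInspector {
    public Object unwrap(Object raw) {
        return raw.toString();
    }
}

// Analogue of a float inspector: parses the same raw value as a Float.
class FloatInspector implements ObjectInspector {
    public Object unwrap(Object raw) {
        return Float.parseFloat(raw.toString());
    }
}

public class UnwrapDemo {
    public static void main(String[] args) {
        // What a CSV SerDe might hand back for a total_price field.
        String rawField = "19.99";

        // With the string inspector (as CSVSerde configures for every
        // column), the unwrapped value stays a String.
        Object asString = new StringInspector().unwrap(rawField);

        // If a float inspector is (wrongly) associated with the field,
        // the same raw value comes back as a Float instead.
        Object asFloat = new FloatInspector().unwrap(rawField);

        System.out.println(asString.getClass().getSimpleName()); // String
        System.out.println(asFloat.getClass().getSimpleName());  // Float
    }
}
```

So if printSchema reports a float for a column the SerDe declared as string, the inspector attached to that field at unwrap time is the place to look.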
Which version of Spark SQL are you using? If you are using a snapshot version, please provide the exact Git commit hash. Thanks!

On Tue, Aug 26, 2014 at 8:29 AM, chutium <teng....@gmail.com> wrote:

> oops, i tried on a managed table, column types will not be changed
>
> so it is mostly due to the serde lib CSVSerDe
> (https://github.com/ogrodnek/csv-serde/blob/master/src/main/java/com/bizo/hive/serde/csv/CSVSerde.java#L123)
> or maybe CSVReader from opencsv?...
>
> but if the columns are defined as string, no matter what type is returned
> from the custom SerDe or CSVReader, they should be cast to string at the
> end, right?
>
> why not use the schema from the hive metadata directly?
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/HiveContext-schemaRDD-printSchema-get-different-dataTypes-feature-or-a-bug-really-strange-and-surpri-tp8035p8039.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.