I can see it's trying to read the Parquet data and failing while decompressing with Snappy: parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
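The underlying failure in the trace is java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path, i.e. the Hive CLI JVM cannot load the snappy-java native library that the Parquet Snappy decompressor needs. One Spark-side workaround is to re-write the data with a codec that does not require a native library; the sketch below (run in the same spark-shell session as elsewhere in the thread) assumes gzip and a hypothetical target table name, neither of which comes from the thread:

  // Sketch only: re-read the existing table and re-write it gzip-compressed,
  // so the Hive reader does not have to load the Snappy native library.
  spark.conf.set("spark.sql.parquet.compression.codec", "gzip")
  val df = spark.table("default.mine")
  df.write
    .mode("overwrite")
    .format("parquet")
    .saveAsTable("default.mine_gzip")   // hypothetical table name

Alternatively, keep Snappy and make the native library loadable by the Hive CLI JVM, for example by pointing -Djava.library.path (via HADOOP_CLIENT_OPTS) at a directory containing the snappy-java native library, or by setting -Dorg.xerial.snappy.tempdir to a writable, exec-enabled directory; which of these applies depends on the environment.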
So the table looks good but this needs to be fixed before you can query the data in hive. Thanks Deepak On Sun, Feb 11, 2018 at 1:45 PM, ☼ R Nair (रविशंकर नायर) < ravishankar.n...@gmail.com> wrote: > When I do that , and then do a select, full of errors. I think Hive table > to read. > > select * from mine; > OK > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details. > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke( > NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke( > DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317) > at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219) > at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44) > at parquet.hadoop.codec.SnappyDecompressor.decompress( > SnappyDecompressor.java:62) > at parquet.hadoop.codec.NonBlockedDecompressorStream.read( > NonBlockedDecompressorStream.java:51) > at java.io.DataInputStream.readFully(DataInputStream.java:195) > at java.io.DataInputStream.readFully(DataInputStream.java:169) > at parquet.bytes.BytesInput$StreamBytesInput.toByteArray( > BytesInput.java:204) > at parquet.column.impl.ColumnReaderImpl.readPageV1( > ColumnReaderImpl.java:557) > at parquet.column.impl.ColumnReaderImpl.access$300( > ColumnReaderImpl.java:57) > at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:516) > at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:513) > at parquet.column.page.DataPageV1.accept(DataPageV1.java:96) > at parquet.column.impl.ColumnReaderImpl.readPage( > ColumnReaderImpl.java:513) > at parquet.column.impl.ColumnReaderImpl.checkRead( > ColumnReaderImpl.java:505) > at parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:607) > at parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:351) > at parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader( > ColumnReadStoreImpl.java:66) > at parquet.column.impl.ColumnReadStoreImpl.getColumnReader( > ColumnReadStoreImpl.java:61) > at parquet.io.RecordReaderImplementation.<init>( > RecordReaderImplementation.java:270) > at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:134) > at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:99) > at parquet.filter2.compat.FilterCompat$NoOpFilter. > accept(FilterCompat.java:154) > at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:99) > at parquet.hadoop.InternalParquetRecordReader.checkRead( > InternalParquetRecordReader.java:137) > at parquet.hadoop.InternalParquetRecordReader.nextKeyValue( > InternalParquetRecordReader.java:208) > at parquet.hadoop.ParquetRecordReader.nextKeyValue( > ParquetRecordReader.java:201) > at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.< > init>(ParquetRecordReaderWrapper.java:122) > at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.< > init>(ParquetRecordReaderWrapper.java:85) > at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat. > getRecordReader(MapredParquetInputFormat.java:72) > at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit. > getRecordReader(FetchOperator.java:673) > at org.apache.hadoop.hive.ql.exec.FetchOperator. 
> getRecordReader(FetchOperator.java:323) > at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow( > FetchOperator.java:445) > at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow( > FetchOperator.java:414) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd( > CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke( > NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke( > DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in > java.library.path > at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867) > at java.lang.Runtime.loadLibrary0(Runtime.java:870) > at java.lang.System.loadLibrary(System.java:1122) > at org.xerial.snappy.SnappyNativeLoader.loadLibrary( > SnappyNativeLoader.java:52) > ... 52 more > Exception in thread "main" org.xerial.snappy.SnappyError: > [FAILED_TO_LOAD_NATIVE_LIBRARY] null > at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229) > at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44) > at parquet.hadoop.codec.SnappyDecompressor.decompress( > SnappyDecompressor.java:62) > at parquet.hadoop.codec.NonBlockedDecompressorStream.read( > NonBlockedDecompressorStream.java:51) > at java.io.DataInputStream.readFully(DataInputStream.java:195) > at java.io.DataInputStream.readFully(DataInputStream.java:169) > at parquet.bytes.BytesInput$StreamBytesInput.toByteArray( > BytesInput.java:204) > at parquet.column.impl.ColumnReaderImpl.readPageV1( > ColumnReaderImpl.java:557) > at parquet.column.impl.ColumnReaderImpl.access$300( > ColumnReaderImpl.java:57) > at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:516) > at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:513) > at parquet.column.page.DataPageV1.accept(DataPageV1.java:96) > at parquet.column.impl.ColumnReaderImpl.readPage( > ColumnReaderImpl.java:513) > at parquet.column.impl.ColumnReaderImpl.checkRead( > ColumnReaderImpl.java:505) > at parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:607) > at parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:351) > at parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader( > ColumnReadStoreImpl.java:66) > at parquet.column.impl.ColumnReadStoreImpl.getColumnReader( > ColumnReadStoreImpl.java:61) > at parquet.io.RecordReaderImplementation.<init>( > RecordReaderImplementation.java:270) > at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:134) > at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:99) > at parquet.filter2.compat.FilterCompat$NoOpFilter. 
> accept(FilterCompat.java:154) > at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:99) > at parquet.hadoop.InternalParquetRecordReader.checkRead( > InternalParquetRecordReader.java:137) > at parquet.hadoop.InternalParquetRecordReader.nextKeyValue( > InternalParquetRecordReader.java:208) > at parquet.hadoop.ParquetRecordReader.nextKeyValue( > ParquetRecordReader.java:201) > at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.< > init>(ParquetRecordReaderWrapper.java:122) > at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.< > init>(ParquetRecordReaderWrapper.java:85) > at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat. > getRecordReader(MapredParquetInputFormat.java:72) > at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit. > getRecordReader(FetchOperator.java:673) > at org.apache.hadoop.hive.ql.exec.FetchOperator. > getRecordReader(FetchOperator.java:323) > at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow( > FetchOperator.java:445) > at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow( > FetchOperator.java:414) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd( > CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke( > NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke( > DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Feb 11, 2018 3:14:06 AM WARNING: parquet.hadoop.ParquetRecordReader: Can > not initialize counter due to context is not a instance of > TaskInputOutputContext, but is org.apache.hadoop.mapreduce. > task.TaskAttemptContextImpl > Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: > RecordReader initialized will read a total of 36635 records. > Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: > at row 0. reading next block > Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: > block read in memory in 27 ms. 
row count = 36635 > > > On Sun, Feb 11, 2018 at 3:10 AM, Deepak Sharma <deepakmc...@gmail.com> > wrote: > >> There was a typo: >> Instead of : >> alter table mine set locations "hdfs://localhost:8020/user/hi >> ve/warehouse/mine"; >> >> Use : >> alter table mine set location "hdfs://localhost:8020/user/hi >> ve/warehouse/mine"; >> >> On Sun, Feb 11, 2018 at 1:38 PM, Deepak Sharma <deepakmc...@gmail.com> >> wrote: >> >>> Try this in hive: >>> alter table mine set locations "hdfs://localhost:8020/user/hi >>> ve/warehouse/mine"; >>> >>> Thanks >>> Deepak >>> >>> On Sun, Feb 11, 2018 at 1:24 PM, ☼ R Nair (रविशंकर नायर) < >>> ravishankar.n...@gmail.com> wrote: >>> >>>> Hi, >>>> Here you go: >>>> >>>> hive> show create table mine; >>>> OK >>>> CREATE TABLE `mine`( >>>> `policyid` int, >>>> `statecode` string, >>>> `socialid` string, >>>> `county` string, >>>> `eq_site_limit` decimal(10,2), >>>> `hu_site_limit` decimal(10,2), >>>> `fl_site_limit` decimal(10,2), >>>> `fr_site_limit` decimal(10,2), >>>> `tiv_2014` decimal(10,2), >>>> `tiv_2015` decimal(10,2), >>>> `eq_site_deductible` int, >>>> `hu_site_deductible` int, >>>> `fl_site_deductible` int, >>>> `fr_site_deductible` int, >>>> `latitude` decimal(6,6), >>>> `longitude` decimal(6,6), >>>> `line` string, >>>> `construction` string, >>>> `point_granularity` int) >>>> ROW FORMAT SERDE >>>> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' >>>> WITH SERDEPROPERTIES ( >>>> 'path'='hdfs://localhost:8020/user/hive/warehouse/mine') >>>> STORED AS INPUTFORMAT >>>> 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' >>>> OUTPUTFORMAT >>>> 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' >>>> LOCATION >>>> 'file:/Users/ravishankarnair/spark-warehouse/mine' >>>> TBLPROPERTIES ( >>>> 'spark.sql.sources.provider'='parquet', >>>> 'spark.sql.sources.schema.numParts'='1', >>>> 'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fi >>>> elds\":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullab >>>> le\":true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}}, >>>> {\"name\":\"statecode\",\"type\":\"string\",\"nullable\": >>>> true,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"n >>>> ame\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"m >>>> etadata\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\" >>>> county\",\"type\":\"string\",\"nullable\":true,\"metadata\": >>>> {\"name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_ >>>> limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\" >>>> metadata\":{\"name\":\"eq_site_limit\",\"scale\":2}},{\" >>>> name\":\"hu_site_limit\",\"type\":\"decimal(10,2)\",\"nullab >>>> le\":true,\"metadata\":{\"name\":\"hu_site_limit\",\"sca >>>> le\":2}},{\"name\":\"fl_site_limit\",\"type\":\"decimal(10, >>>> 2)\",\"nullable\":true,\"metadata\":{\"name\":\"fl_ >>>> site_limit\",\"scale\":2}},{\"name\":\"fr_site_limit\",\"typ >>>> e\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\" >>>> name\":\"fr_site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\ >>>> ",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\": >>>> {\"name\":\"tiv_2014\",\"scale\":2}},{\"name\":\"tiv_ >>>> 2015\",\"type\":\"decimal(10,2)\",\"nullable\":true,\" >>>> metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"name\" >>>> :\"eq_site_deductible\",\"type\":\"integer\",\"nullable\ >>>> ":true,\"metadata\":{\"name\":\"eq_site_deductible\",\" >>>> scale\":0}},{\"name\":\"hu_site_deductible\",\"type\":\" >>>> integer\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site >>>> 
_deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\" >>>> ,\"type\":\"integer\",\"nullable\":true,\"metadata\":{ >>>> \"name\":\"fl_site_deductible\",\"scale\":0}},{\"name\":\" >>>> fr_site_deductible\",\"type\":\"integer\",\"nullable\":true, >>>> \"metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0} >>>> },{\"name\":\"latitude\",\"type\":\"decimal(6,6)\",\" >>>> nullable\":true,\"metadata\":{\"name\":\"latitude\",\"scale\ >>>> ":6}},{\"name\":\"longitude\",\"type\":\"decimal(6,6)\",\" >>>> nullable\":true,\"metadata\":{\"name\":\"longitude\",\" >>>> scale\":6}},{\"name\":\"line\",\"type\":\"string\",\" >>>> nullable\":true,\"metadata\":{\"name\":\"line\",\"scale\":0} >>>> },{\"name\":\"construction\",\"type\":\"string\",\"nullable\ >>>> ":true,\"metadata\":{\"name\":\"construction\",\"scale\":0}} >>>> ,{\"name\":\"point_granularity\",\"type\":\"integer\",\" >>>> nullable\":true,\"metadata\":{\"name\":\"point_granularity\" >>>> ,\"scale\":0}}]}', >>>> 'transient_lastDdlTime'='1518335598') >>>> Time taken: 0.13 seconds, Fetched: 35 row(s) >>>> >>>> On Sun, Feb 11, 2018 at 2:36 AM, Shmuel Blitz < >>>> shmuel.bl...@similarweb.com> wrote: >>>> >>>>> Please run the following command, and paste the result: >>>>> SHOW CREATE TABLE <<TABLE-NAME>> >>>>> >>>>> On Sun, Feb 11, 2018 at 7:56 AM, ☼ R Nair (रविशंकर नायर) < >>>>> ravishankar.n...@gmail.com> wrote: >>>>> >>>>>> No, No luck. >>>>>> >>>>>> Thanks >>>>>> >>>>>> On Sun, Feb 11, 2018 at 12:48 AM, Deepak Sharma < >>>>>> deepakmc...@gmail.com> wrote: >>>>>> >>>>>>> In hive cli: >>>>>>> msck repair table 《table_name》; >>>>>>> >>>>>>> Thanks >>>>>>> Deepak >>>>>>> >>>>>>> On Feb 11, 2018 11:14, "☼ R Nair (रविशंकर नायर)" < >>>>>>> ravishankar.n...@gmail.com> wrote: >>>>>>> >>>>>>>> NO, can you pease explain the command ? Let me try now. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> On Sun, Feb 11, 2018 at 12:40 AM, Deepak Sharma < >>>>>>>> deepakmc...@gmail.com> wrote: >>>>>>>> >>>>>>>>> I am not sure about the exact issue bjt i see you are partioning >>>>>>>>> while writing from spark. >>>>>>>>> Did you tried msck repair on the table before reading it in hive ? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> Deepak >>>>>>>>> >>>>>>>>> On Feb 11, 2018 11:06, "☼ R Nair (रविशंकर नायर)" < >>>>>>>>> ravishankar.n...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> All, >>>>>>>>>> >>>>>>>>>> Thanks for the inputs. Again I am not successful. I think, we >>>>>>>>>> need to resolve this, as this is a very common requirement. >>>>>>>>>> >>>>>>>>>> Please go through my complete code: >>>>>>>>>> >>>>>>>>>> STEP 1: Started Spark shell as spark-shell --master yarn >>>>>>>>>> >>>>>>>>>> STEP 2: Flowing code is being given as inout to shark shell >>>>>>>>>> >>>>>>>>>> import org.apache.spark.sql.Row >>>>>>>>>> import org.apache.spark.sql.SparkSession >>>>>>>>>> val warehouseLocation ="/user/hive/warehouse" >>>>>>>>>> >>>>>>>>>> val spark = SparkSession.builder().appName("Spark Hive >>>>>>>>>> Example").config("spark.sql.warehouse.dir", >>>>>>>>>> warehouseLocation).enableHiveSupport().getOrCreate() >>>>>>>>>> >>>>>>>>>> import org.apache.spark.sql._ >>>>>>>>>> var passion_df = spark.read. >>>>>>>>>> format("jdbc"). >>>>>>>>>> option("url", "jdbc:mysql://localhost:3307/policies"). >>>>>>>>>> option("driver" ,"com.mysql.jdbc.Driver"). >>>>>>>>>> option("user", "root"). >>>>>>>>>> option("password", "root"). >>>>>>>>>> option("dbtable", "insurancedetails"). >>>>>>>>>> option("partitionColumn", "policyid"). >>>>>>>>>> option("lowerBound", "1"). 
>>>>>>>>>> option("upperBound", "100000"). >>>>>>>>>> option("numPartitions", "4"). >>>>>>>>>> load() >>>>>>>>>> //Made sure that passion_df is created, as passion_df.show(5) >>>>>>>>>> shows me correct data. >>>>>>>>>> passion_df.write.saveAsTable("default.mine") //Default parquet >>>>>>>>>> >>>>>>>>>> STEP 3: Went to HIVE. Started HIVE prompt. >>>>>>>>>> >>>>>>>>>> hive> show tables; >>>>>>>>>> OK >>>>>>>>>> callcentervoicelogs >>>>>>>>>> mine >>>>>>>>>> Time taken: 0.035 seconds, Fetched: 2 row(s) >>>>>>>>>> //As you can see HIVE is showing the table "mine" in default >>>>>>>>>> schema. >>>>>>>>>> >>>>>>>>>> STEP 4: HERE IS THE PROBLEM. >>>>>>>>>> >>>>>>>>>> hive> select * from mine; >>>>>>>>>> OK >>>>>>>>>> Time taken: 0.354 seconds >>>>>>>>>> hive> >>>>>>>>>> //Where is the data ??? >>>>>>>>>> >>>>>>>>>> STEP 5: >>>>>>>>>> >>>>>>>>>> See the below command on HIVE >>>>>>>>>> >>>>>>>>>> describe formatted mine; >>>>>>>>>> OK >>>>>>>>>> # col_name data_type comment >>>>>>>>>> >>>>>>>>>> policyid int >>>>>>>>>> statecode string >>>>>>>>>> socialid string >>>>>>>>>> county string >>>>>>>>>> eq_site_limit decimal(10,2) >>>>>>>>>> hu_site_limit decimal(10,2) >>>>>>>>>> fl_site_limit decimal(10,2) >>>>>>>>>> fr_site_limit decimal(10,2) >>>>>>>>>> tiv_2014 decimal(10,2) >>>>>>>>>> tiv_2015 decimal(10,2) >>>>>>>>>> eq_site_deductible int >>>>>>>>>> hu_site_deductible int >>>>>>>>>> fl_site_deductible int >>>>>>>>>> fr_site_deductible int >>>>>>>>>> latitude decimal(6,6) >>>>>>>>>> longitude decimal(6,6) >>>>>>>>>> line string >>>>>>>>>> construction string >>>>>>>>>> point_granularity int >>>>>>>>>> >>>>>>>>>> # Detailed Table Information >>>>>>>>>> Database: default >>>>>>>>>> Owner: ravishankarnair >>>>>>>>>> CreateTime: Sun Feb 11 00:26:40 EST 2018 >>>>>>>>>> LastAccessTime: UNKNOWN >>>>>>>>>> Protect Mode: None >>>>>>>>>> Retention: 0 >>>>>>>>>> Location: file:/Users/ravishankarnair/sp >>>>>>>>>> ark-warehouse/mine >>>>>>>>>> Table Type: MANAGED_TABLE >>>>>>>>>> Table Parameters: >>>>>>>>>> spark.sql.sources.provider parquet >>>>>>>>>> spark.sql.sources.schema.numParts 1 >>>>>>>>>> spark.sql.sources.schema.part.0 {\"type\":\"struct\",\"fields\ >>>>>>>>>> ":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullable\": >>>>>>>>>> true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}},{\"na >>>>>>>>>> me\":\"statecode\",\"type\":\"string\",\"nullable\":true,\"m >>>>>>>>>> etadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name\":\" >>>>>>>>>> Socialid\",\"type\":\"string\",\"nullable\":true,\"metadata\ >>>>>>>>>> ":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"county\", >>>>>>>>>> \"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\" >>>>>>>>>> :\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\",\"type >>>>>>>>>> \":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\ >>>>>>>>>> ":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site_limit\ >>>>>>>>>> ",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\": >>>>>>>>>> {\"name\":\"hu_site_limit\",\"scale\":2}},{\"name\":\"fl_sit >>>>>>>>>> e_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"met >>>>>>>>>> adata\":{\"name\":\"fl_site_limit\",\"scale\":2}},{\"name\": >>>>>>>>>> \"fr_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":tr >>>>>>>>>> ue,\"metadata\":{\"name\":\"fr_site_limit\",\"scale\":2}},{\ >>>>>>>>>> "name\":\"tiv_2014\",\"type\":\"decimal(10,2)\",\"nullable\" >>>>>>>>>> :true,\"metadata\":{\"name\":\"tiv_2014\",\"scale\":2}},{\"n >>>>>>>>>> ame\":\"tiv_2015\",\"type\":\"decimal(10,2)\",\"nullable\":t 
>>>>>>>>>> rue,\"metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"nam >>>>>>>>>> e\":\"eq_site_deductible\",\"type\":\"integer\",\"nullable\" >>>>>>>>>> :true,\"metadata\":{\"name\":\"eq_site_deductible\",\"scale\ >>>>>>>>>> ":0}},{\"name\":\"hu_site_deductible\",\"type\":\"intege >>>>>>>>>> r\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_ >>>>>>>>>> deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\", >>>>>>>>>> \"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\ >>>>>>>>>> ":\"fl_site_deductible\",\"scale\":0}},{\"name\":\"fr_ >>>>>>>>>> site_deductible\",\"type\":\"integer\",\"nullable\":true,\" >>>>>>>>>> metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\ >>>>>>>>>> "name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\": >>>>>>>>>> true,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\" >>>>>>>>>> name\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\": >>>>>>>>>> true,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\" >>>>>>>>>> name\":\"line\",\"type\":\"string\",\"nullable\":true,\" >>>>>>>>>> metadata\":{\"name\":\"line\",\"scale\":0}},{\"name\":\" >>>>>>>>>> construction\",\"type\":\"string\",\"nullable\":true,\" >>>>>>>>>> metadata\":{\"name\":\"construction\",\"scale\":0}},{ >>>>>>>>>> \"name\":\"point_granularity\",\"type\":\"integer\",\"nullab >>>>>>>>>> le\":true,\"metadata\":{\"name\":\"point_granularity\",\" >>>>>>>>>> scale\":0}}]} >>>>>>>>>> transient_lastDdlTime 1518326800 >>>>>>>>>> >>>>>>>>>> # Storage Information >>>>>>>>>> SerDe Library: org.apache.hadoop.hive.ql.io.p >>>>>>>>>> arquet.serde.ParquetHiveSerDe >>>>>>>>>> InputFormat: org.apache.hadoop.hive.ql.io.p >>>>>>>>>> arquet.MapredParquetInputFormat >>>>>>>>>> OutputFormat: org.apache.hadoop.hive.ql.io.p >>>>>>>>>> arquet.MapredParquetOutputFormat >>>>>>>>>> Compressed: No >>>>>>>>>> Num Buckets: -1 >>>>>>>>>> Bucket Columns: [] >>>>>>>>>> Sort Columns: [] >>>>>>>>>> Storage Desc Params: >>>>>>>>>> path hdfs://localhost:8020/user/hiv >>>>>>>>>> e/warehouse/mine >>>>>>>>>> serialization.format 1 >>>>>>>>>> Time taken: 0.077 seconds, Fetched: 48 row(s) >>>>>>>>>> >>>>>>>>>> Now, I see your advise and support. Whats the issue? Am I doing >>>>>>>>>> wrong, it it a bug ? I am using Spark 2.2.1, HIVE 1.2.1, HADOOP >>>>>>>>>> 2.7.3. All >>>>>>>>>> class path, configuration are set properly. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> >>>>>>>>>> Ravion >>>>>>>>>> >>>>>>>>>> On Fri, Feb 9, 2018 at 1:29 PM, Nicholas Hakobian < >>>>>>>>>> nicholas.hakob...@rallyhealth.com> wrote: >>>>>>>>>> >>>>>>>>>>> Its possible that the format of your table is not compatible >>>>>>>>>>> with your version of hive, so Spark saved it in a way such that >>>>>>>>>>> only Spark >>>>>>>>>>> can read it. When this happens it prints out a very visible warning >>>>>>>>>>> letting >>>>>>>>>>> you know this has happened. >>>>>>>>>>> >>>>>>>>>>> We've seen it most frequently when trying to save a parquet file >>>>>>>>>>> with a column in date format into a Hive table. In older versions >>>>>>>>>>> of hive, >>>>>>>>>>> its parquet reader/writer did not support Date formats (among a >>>>>>>>>>> couple >>>>>>>>>>> others). >>>>>>>>>>> >>>>>>>>>>> Nicholas Szandor Hakobian, Ph.D. 
>>>>>>>>>>> Staff Data Scientist >>>>>>>>>>> Rally Health >>>>>>>>>>> nicholas.hakob...@rallyhealth.com >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Feb 9, 2018 at 9:59 AM, Prakash Joshi < >>>>>>>>>>> prakashcjos...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Ravi, >>>>>>>>>>>> >>>>>>>>>>>> Can you send the result of >>>>>>>>>>>> Show create table your_table_name >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> Prakash >>>>>>>>>>>> >>>>>>>>>>>> On Feb 9, 2018 8:20 PM, "☼ R Nair (रविशंकर नायर)" < >>>>>>>>>>>> ravishankar.n...@gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> All, >>>>>>>>>>>>> >>>>>>>>>>>>> It has been three days continuously I am on this issue. Not >>>>>>>>>>>>> getting any clue. >>>>>>>>>>>>> >>>>>>>>>>>>> Environment: Spark 2.2.x, all configurations are correct. >>>>>>>>>>>>> hive-site.xml is in spark's conf. >>>>>>>>>>>>> >>>>>>>>>>>>> 1) Step 1: I created a data frame DF1 reading a csv file. >>>>>>>>>>>>> >>>>>>>>>>>>> 2) Did manipulations on DF1. Resulting frame is passion_df. >>>>>>>>>>>>> >>>>>>>>>>>>> 3) passion_df.write.format("orc") >>>>>>>>>>>>> .saveAsTable("sampledb.passion") >>>>>>>>>>>>> >>>>>>>>>>>>> 4) The metastore shows the hive table., when I do "show >>>>>>>>>>>>> tables" in HIVE, I can see table name >>>>>>>>>>>>> >>>>>>>>>>>>> 5) I can't select in HIVE, though I can select from SPARK as >>>>>>>>>>>>> spark.sql("select * from sampledb.passion") >>>>>>>>>>>>> >>>>>>>>>>>>> Whats going on here? Please help. Why I am not seeing data >>>>>>>>>>>>> from HIVE prompt? >>>>>>>>>>>>> The "describe formatted " command on the table in HIVE shows >>>>>>>>>>>>> he data is is in default warehouse location ( >>>>>>>>>>>>> /user/hive/warehouse) since I >>>>>>>>>>>>> set it. >>>>>>>>>>>>> >>>>>>>>>>>>> I am not getting any definite answer anywhere. Many >>>>>>>>>>>>> suggestions and answers given in Stackoverflow et al.Nothing >>>>>>>>>>>>> really works. >>>>>>>>>>>>> >>>>>>>>>>>>> So asking experts here for some light on this, thanks >>>>>>>>>>>>> >>>>>>>>>>>>> Best, >>>>>>>>>>>>> Ravion >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Shmuel Blitz >>>>> Big Data Developer >>>>> Email: shmuel.bl...@similarweb.com >>>>> www.similarweb.com >>>>> <https://www.facebook.com/SimilarWeb/> >>>>> <https://www.linkedin.com/company/429838/> >>>>> <https://twitter.com/similarweb> >>>>> >>>> >>>> >>>> >>>> -- >>>> >>>> >>> >>> >>> -- >>> Thanks >>> Deepak >>> www.bigdatabig.com >>> www.keosha.net >>> >> >> >> >> -- >> Thanks >> Deepak >> www.bigdatabig.com >> www.keosha.net >> > > > > -- > > -- Thanks Deepak www.bigdatabig.com www.keosha.net
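A closing observation on the mismatch visible in the outputs above: DESCRIBE FORMATTED and SHOW CREATE TABLE report LOCATION file:/Users/ravishankarnair/spark-warehouse/mine, while the SERDEPROPERTIES path points at hdfs://localhost:8020/user/hive/warehouse/mine. In spark-shell a SparkSession already exists, so SparkSession.builder().config("spark.sql.warehouse.dir", ...).enableHiveSupport().getOrCreate() returns that existing session and the warehouse setting may not take effect, which would leave newly saved tables under the default local spark-warehouse directory. A sketch of one way to avoid this for future writes (the --conf launch option is an assumption about the setup, not something tried in the thread):

  // Start the shell with the warehouse directory fixed before any session exists, e.g.:
  //   spark-shell --master yarn --conf spark.sql.warehouse.dir=hdfs://localhost:8020/user/hive/warehouse
  // then re-save and confirm the location Hive will read from:
  passion_df.write.mode("overwrite").saveAsTable("default.mine")
  spark.sql("DESCRIBE FORMATTED default.mine").show(100, false)  // LOCATION should now be on HDFS

For the table that already exists, the ALTER TABLE mine SET LOCATION correction quoted earlier, combined with resolving the Snappy native-library error, should make the same data readable from the Hive prompt.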