Hi,

So, is this a bug, or something I need to fix? If it's our issue, how can we fix it? Please help.
Best,

On Sun, Feb 11, 2018 at 3:49 AM, Shmuel Blitz <shmuel.bl...@similarweb.com> wrote:

> Your table is missing a "PARTITIONED BY" section.
>
> Spark 2.x saves the partition information in the TBLPROPERTIES section.
>
> On Sun, Feb 11, 2018 at 10:41 AM, Deepak Sharma <deepakmc...@gmail.com> wrote:
>
>> I can see it's trying to read the parquet and failing while decompressing with snappy:
>> parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
>>
>> So the table looks good, but this needs to be fixed before you can query the data in Hive.
>>
>> Thanks
>> Deepak
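A note for anyone hitting the same trace below: the root failure is "java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path", i.e. the JVM running the Hive CLI cannot locate the snappy-java native library. A minimal sketch of two common workarounds; the jar and native-library paths here are hypothetical placeholders and depend on your installation:

  # Sketch only -- adjust the paths to wherever snappy-java lives in your install.
  # Workaround 1: put the snappy-java jar (it bundles the native library)
  # on the Hive CLI's auxiliary classpath:
  hive --auxpath /path/to/snappy-java-1.1.x.jar

  # Workaround 2: point the JVM at a directory that contains libsnappyjava:
  export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/path/to/hadoop/lib/native"
  hive

Either way the goal is the same: make the native snappy decompressor loadable before Hive tries to read the parquet pages.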
>>
>> On Sun, Feb 11, 2018 at 1:45 PM, ☼ R Nair (रविशंकर नायर) <ravishankar.n...@gmail.com> wrote:
>>
>>> When I do that and then do a select, I get a screen full of errors. I think Hive fails to read the table:
>>>
>>> select * from mine;
>>> OK
>>> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
>>> SLF4J: Defaulting to no-operation (NOP) logger implementation
>>> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
>>> java.lang.reflect.InvocationTargetException
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
>>> at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
>>> at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
>>> at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)
>>> at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)
>>> at java.io.DataInputStream.readFully(DataInputStream.java:195)
>>> at java.io.DataInputStream.readFully(DataInputStream.java:169)
>>> at parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:204)
>>> at parquet.column.impl.ColumnReaderImpl.readPageV1(ColumnReaderImpl.java:557)
>>> at parquet.column.impl.ColumnReaderImpl.access$300(ColumnReaderImpl.java:57)
>>> at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:516)
>>> at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:513)
>>> at parquet.column.page.DataPageV1.accept(DataPageV1.java:96)
>>> at parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:513)
>>> at parquet.column.impl.ColumnReaderImpl.checkRead(ColumnReaderImpl.java:505)
>>> at parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:607)
>>> at parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:351)
>>> at parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:66)
>>> at parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:61)
>>> at parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:270)
>>> at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:134)
>>> at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:99)
>>> at parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:154)
>>> at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:99)
>>> at parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
>>> at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:208)
>>> at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
>>> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:122)
>>> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
>>> at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
>>> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>>> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670)
>>> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>>> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>>> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>>> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>>> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>>> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>>> Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
>>> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
>>> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
>>> at java.lang.System.loadLibrary(System.java:1122)
>>> at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
>>> ... 52 more
>>> Exception in thread "main" org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
>>> at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
>>> at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
>>> at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)
>>> at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)
>>> at java.io.DataInputStream.readFully(DataInputStream.java:195)
>>> at java.io.DataInputStream.readFully(DataInputStream.java:169)
>>> at parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:204)
>>> at parquet.column.impl.ColumnReaderImpl.readPageV1(ColumnReaderImpl.java:557)
>>> at parquet.column.impl.ColumnReaderImpl.access$300(ColumnReaderImpl.java:57)
>>> at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:516)
>>> at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:513)
>>> at parquet.column.page.DataPageV1.accept(DataPageV1.java:96)
>>> at parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:513)
>>> at parquet.column.impl.ColumnReaderImpl.checkRead(ColumnReaderImpl.java:505)
>>> at parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:607)
>>> at parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:351)
>>> at parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:66)
>>> at parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:61)
>>> at parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:270)
>>> at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:134)
>>> at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:99)
>>> at parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:154)
>>> at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:99)
>>> at parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
>>> at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:208)
>>> at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
>>> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:122)
>>> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
>>> at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
>>> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>>> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670)
>>> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>>> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>>> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>>> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>>> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>>> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>>> Feb 11, 2018 3:14:06 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
>>> Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 36635 records.
>>> Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
>>> Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 27 ms. row count = 36635
>>>
>>> On Sun, Feb 11, 2018 at 3:10 AM, Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>
>>>> There was a typo.
>>>> Instead of:
>>>> alter table mine set locations "hdfs://localhost:8020/user/hive/warehouse/mine";
>>>>
>>>> use:
>>>> alter table mine set location "hdfs://localhost:8020/user/hive/warehouse/mine";
>>>>
>>>> On Sun, Feb 11, 2018 at 1:38 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>>
>>>>> Try this in hive:
>>>>> alter table mine set locations "hdfs://localhost:8020/user/hive/warehouse/mine";
>>>>>
>>>>> Thanks
>>>>> Deepak
>>>>>
>>>>> On Sun, Feb 11, 2018 at 1:24 PM, ☼ R Nair (रविशंकर नायर) <ravishankar.n...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> Here you go:
>>>>>>
>>>>>> hive> show create table mine;
>>>>>> OK
>>>>>> CREATE TABLE `mine`(
>>>>>> `policyid` int,
>>>>>> `statecode` string,
>>>>>> `socialid` string,
>>>>>> `county` string,
>>>>>> `eq_site_limit` decimal(10,2),
>>>>>> `hu_site_limit` decimal(10,2),
>>>>>> `fl_site_limit` decimal(10,2),
>>>>>> `fr_site_limit` decimal(10,2),
>>>>>> `tiv_2014` decimal(10,2),
>>>>>> `tiv_2015` decimal(10,2),
>>>>>> `eq_site_deductible` int,
>>>>>> `hu_site_deductible` int,
>>>>>> `fl_site_deductible` int,
>>>>>> `fr_site_deductible` int,
>>>>>> `latitude` decimal(6,6),
>>>>>> `longitude` decimal(6,6),
>>>>>> `line` string,
>>>>>> `construction` string,
>>>>>> `point_granularity` int)
>>>>>> ROW FORMAT SERDE
>>>>>> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>>>>>> WITH SERDEPROPERTIES (
>>>>>> 'path'='hdfs://localhost:8020/user/hive/warehouse/mine')
>>>>>> STORED AS INPUTFORMAT
>>>>>> 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>>>>>> OUTPUTFORMAT
>>>>>> 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
>>>>>> LOCATION
>>>>>> 'file:/Users/ravishankarnair/spark-warehouse/mine'
>>>>>> TBLPROPERTIES (
>>>>>> 'spark.sql.sources.provider'='parquet',
>>>>>> 'spark.sql.sources.schema.numParts'='1',
>>>>>> 'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}},{\"name\":\"statecode\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"county\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_limit\",\"scale\":2}},{\"name\":\"fl_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_limit\",\"scale\":2}},{\"name\":\"fr_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2014\",\"scale\":2}},{\"name\":\"tiv_2015\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"name\":\"eq_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_deductible\",\"scale\":0}},{\"name\":\"hu_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_deductible\",\"scale\":0}},{\"name\":\"fr_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\"name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"name\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"name\":\"line\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"line\",\"scale\":0}},{\"name\":\"construction\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"construction\",\"scale\":0}},{\"name\":\"point_granularity\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"point_granularity\",\"scale\":0}}]}',
>>>>>> 'transient_lastDdlTime'='1518335598')
>>>>>> Time taken: 0.13 seconds, Fetched: 35 row(s)
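A side observation on the output above, for anyone comparing: the metastore LOCATION is 'file:/Users/ravishankarnair/spark-warehouse/mine' (local disk), while the Spark serde property 'path' is 'hdfs://localhost:8020/user/hive/warehouse/mine'. Hive reads from LOCATION and Spark reads from 'path', so the two engines are looking at different directories. A minimal HiveQL sketch of the repair Deepak suggests, assuming the parquet files actually live under the HDFS path:

  -- Sketch: point the metastore at whichever URI really holds the parquet files.
  ALTER TABLE mine SET LOCATION 'hdfs://localhost:8020/user/hive/warehouse/mine';
  -- Re-check that LOCATION and the serde 'path' now agree:
  SHOW CREATE TABLE mine;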
\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"meta >>>>>> data\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"coun >>>>>> ty\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\" >>>>>> name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\", >>>>>> \"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{ >>>>>> \"name\":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site >>>>>> _limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\" >>>>>> metadata\":{\"name\":\"hu_site_limit\",\"scale\":2}},{\" >>>>>> name\":\"fl_site_limit\",\"type\":\"decimal(10,2)\",\"nullab >>>>>> le\":true,\"metadata\":{\"name\":\"fl_site_limit\",\"sca >>>>>> le\":2}},{\"name\":\"fr_site_limit\",\"type\":\"decimal(10, >>>>>> 2)\",\"nullable\":true,\"metadata\":{\"name\":\"fr_ >>>>>> site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\",\"type\":\ >>>>>> "decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\" >>>>>> tiv_2014\",\"scale\":2}},{\"name\":\"tiv_2015\",\"type\":\ >>>>>> "decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\" >>>>>> tiv_2015\",\"scale\":2}},{\"name\":\"eq_site_deductible\", >>>>>> \"type\":\"integer\",\"nullable\":true,\"metadata\":{ >>>>>> \"name\":\"eq_site_deductible\",\"scale\":0}},{\"name\":\" >>>>>> hu_site_deductible\",\"type\":\"integer\",\"nullable\":true, >>>>>> \"metadata\":{\"name\":\"hu_site_deductible\",\"scale\":0} >>>>>> },{\"name\":\"fl_site_deductible\",\"type\":\" >>>>>> integer\",\"nullable\":true,\"metadata\":{\"name\":\"fl_ >>>>>> site_deductible\",\"scale\":0}},{\"name\":\"fr_site_ >>>>>> deductible\",\"type\":\"integer\",\"nullable\":true,\"metada >>>>>> ta\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\" >>>>>> name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\":t >>>>>> rue,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"nam >>>>>> e\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\":tr >>>>>> ue,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"nam >>>>>> e\":\"line\",\"type\":\"string\",\"nullable\":true,\"metadat >>>>>> a\":{\"name\":\"line\",\"scale\":0}},{\"name\":\"constr >>>>>> uction\",\"type\":\"string\",\"nullable\":true,\"metadata\": >>>>>> {\"name\":\"construction\",\"scale\":0}},{\"name\":\"point_ >>>>>> granularity\",\"type\":\"integer\",\"nullable\":true,\" >>>>>> metadata\":{\"name\":\"point_granularity\",\"scale\":0}}]}', >>>>>> 'transient_lastDdlTime'='1518335598') >>>>>> Time taken: 0.13 seconds, Fetched: 35 row(s) >>>>>> >>>>>> On Sun, Feb 11, 2018 at 2:36 AM, Shmuel Blitz < >>>>>> shmuel.bl...@similarweb.com> wrote: >>>>>> >>>>>>> Please run the following command, and paste the result: >>>>>>> SHOW CREATE TABLE <<TABLE-NAME>> >>>>>>> >>>>>>> On Sun, Feb 11, 2018 at 7:56 AM, ☼ R Nair (रविशंकर नायर) < >>>>>>> ravishankar.n...@gmail.com> wrote: >>>>>>> >>>>>>>> No, No luck. >>>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> On Sun, Feb 11, 2018 at 12:48 AM, Deepak Sharma < >>>>>>>> deepakmc...@gmail.com> wrote: >>>>>>>> >>>>>>>>> In hive cli: >>>>>>>>> msck repair table 《table_name》; >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> Deepak >>>>>>>>> >>>>>>>>> On Feb 11, 2018 11:14, "☼ R Nair (रविशंकर नायर)" < >>>>>>>>> ravishankar.n...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> NO, can you pease explain the command ? Let me try now. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> >>>>>>>>>> On Sun, Feb 11, 2018 at 12:40 AM, Deepak Sharma < >>>>>>>>>> deepakmc...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> I am not sure about the exact issue bjt i see you are partioning >>>>>>>>>>> while writing from spark. 
>>>>>>>>>>>
>>>>>>>>>>> On Feb 11, 2018 11:06, "☼ R Nair (रविशंकर नायर)" <ravishankar.n...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> All,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the inputs. Again, I was not successful. I think we need to resolve this, as it is a very common requirement.
>>>>>>>>>>>>
>>>>>>>>>>>> Please go through my complete code:
>>>>>>>>>>>>
>>>>>>>>>>>> STEP 1: Started the Spark shell as: spark-shell --master yarn
>>>>>>>>>>>>
>>>>>>>>>>>> STEP 2: The following code is given as input to the Spark shell:
>>>>>>>>>>>>
>>>>>>>>>>>> import org.apache.spark.sql.Row
>>>>>>>>>>>> import org.apache.spark.sql.SparkSession
>>>>>>>>>>>> val warehouseLocation = "/user/hive/warehouse"
>>>>>>>>>>>>
>>>>>>>>>>>> val spark = SparkSession.builder().appName("Spark Hive Example").config("spark.sql.warehouse.dir", warehouseLocation).enableHiveSupport().getOrCreate()
>>>>>>>>>>>>
>>>>>>>>>>>> import org.apache.spark.sql._
>>>>>>>>>>>> var passion_df = spark.read.
>>>>>>>>>>>> format("jdbc").
>>>>>>>>>>>> option("url", "jdbc:mysql://localhost:3307/policies").
>>>>>>>>>>>> option("driver", "com.mysql.jdbc.Driver").
>>>>>>>>>>>> option("user", "root").
>>>>>>>>>>>> option("password", "root").
>>>>>>>>>>>> option("dbtable", "insurancedetails").
>>>>>>>>>>>> option("partitionColumn", "policyid").
>>>>>>>>>>>> option("lowerBound", "1").
>>>>>>>>>>>> option("upperBound", "100000").
>>>>>>>>>>>> option("numPartitions", "4").
>>>>>>>>>>>> load()
>>>>>>>>>>>> // Made sure that passion_df is created; passion_df.show(5) shows me correct data.
>>>>>>>>>>>> passion_df.write.saveAsTable("default.mine") // Default parquet
>>>>>>>>>>>>
>>>>>>>>>>>> STEP 3: Went to HIVE. Started the HIVE prompt.
>>>>>>>>>>>>
>>>>>>>>>>>> hive> show tables;
>>>>>>>>>>>> OK
>>>>>>>>>>>> callcentervoicelogs
>>>>>>>>>>>> mine
>>>>>>>>>>>> Time taken: 0.035 seconds, Fetched: 2 row(s)
>>>>>>>>>>>> // As you can see, HIVE is showing the table "mine" in the default schema.
>>>>>>>>>>>>
>>>>>>>>>>>> STEP 4: HERE IS THE PROBLEM.
>>>>>>>>>>>>
>>>>>>>>>>>> hive> select * from mine;
>>>>>>>>>>>> OK
>>>>>>>>>>>> Time taken: 0.354 seconds
>>>>>>>>>>>> hive>
>>>>>>>>>>>> // Where is the data???
>>>>>>>>>>>>
>>>>>>>>>>>> STEP 5:
>>>>>>>>>>>>
>>>>>>>>>>>> See the below command on HIVE:
>>>>>>>>>>>>
>>>>>>>>>>>> describe formatted mine;
>>>>>>>>>>>> OK
>>>>>>>>>>>> # col_name data_type comment
>>>>>>>>>>>>
>>>>>>>>>>>> policyid int
>>>>>>>>>>>> statecode string
>>>>>>>>>>>> socialid string
>>>>>>>>>>>> county string
>>>>>>>>>>>> eq_site_limit decimal(10,2)
>>>>>>>>>>>> hu_site_limit decimal(10,2)
>>>>>>>>>>>> fl_site_limit decimal(10,2)
>>>>>>>>>>>> fr_site_limit decimal(10,2)
>>>>>>>>>>>> tiv_2014 decimal(10,2)
>>>>>>>>>>>> tiv_2015 decimal(10,2)
>>>>>>>>>>>> eq_site_deductible int
>>>>>>>>>>>> hu_site_deductible int
>>>>>>>>>>>> fl_site_deductible int
>>>>>>>>>>>> fr_site_deductible int
>>>>>>>>>>>> latitude decimal(6,6)
>>>>>>>>>>>> longitude decimal(6,6)
>>>>>>>>>>>> line string
>>>>>>>>>>>> construction string
>>>>>>>>>>>> point_granularity int
>>>>>>>>>>>>
>>>>>>>>>>>> # Detailed Table Information
>>>>>>>>>>>> Database: default
>>>>>>>>>>>> Owner: ravishankarnair
>>>>>>>>>>>> CreateTime: Sun Feb 11 00:26:40 EST 2018
>>>>>>>>>>>> LastAccessTime: UNKNOWN
>>>>>>>>>>>> Protect Mode: None
>>>>>>>>>>>> Retention: 0
>>>>>>>>>>>> Location: file:/Users/ravishankarnair/spark-warehouse/mine
>>>>>>>>>>>> Table Type: MANAGED_TABLE
>>>>>>>>>>>> Table Parameters:
>>>>>>>>>>>> spark.sql.sources.provider parquet
>>>>>>>>>>>> spark.sql.sources.schema.numParts 1
>>>>>>>>>>>> spark.sql.sources.schema.part.0 {\"type\":\"struct\",\"fields\":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}},{\"name\":\"statecode\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"county\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_limit\",\"scale\":2}},{\"name\":\"fl_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_limit\",\"scale\":2}},{\"name\":\"fr_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2014\",\"scale\":2}},{\"name\":\"tiv_2015\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"name\":\"eq_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_deductible\",\"scale\":0}},{\"name\":\"hu_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_deductible\",\"scale\":0}},{\"name\":\"fr_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\"name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"name\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"name\":\"line\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"line\",\"scale\":0}},{\"name\":\"construction\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"construction\",\"scale\":0}},{\"name\":\"point_granularity\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"point_granularity\",\"scale\":0}}]}
\"name\":\"fr_site_deductible\",\"scale\":0}},{\"name\":\" >>>>>>>>>>>> latitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\" >>>>>>>>>>>> metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"name\":\" >>>>>>>>>>>> longitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\" >>>>>>>>>>>> metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"name\":\ >>>>>>>>>>>> "line\",\"type\":\"string\",\"nullable\":true,\"metadata\":{ >>>>>>>>>>>> \"name\":\"line\",\"scale\":0}},{\"name\":\"construction\",\ >>>>>>>>>>>> "type\":\"string\",\"nullable\":true,\"metadata\":{\"name\": >>>>>>>>>>>> \"construction\",\"scale\":0}},{\"name\":\"point_granularity >>>>>>>>>>>> \",\"type\":\"integer\",\"nullable\":true,\"metadata\":{ >>>>>>>>>>>> \"name\":\"point_granularity\",\"scale\":0}}]} >>>>>>>>>>>> transient_lastDdlTime 1518326800 >>>>>>>>>>>> >>>>>>>>>>>> # Storage Information >>>>>>>>>>>> SerDe Library: org.apache.hadoop.hive.ql.io.p >>>>>>>>>>>> arquet.serde.ParquetHiveSerDe >>>>>>>>>>>> InputFormat: org.apache.hadoop.hive.ql.io.p >>>>>>>>>>>> arquet.MapredParquetInputFormat >>>>>>>>>>>> OutputFormat: org.apache.hadoop.hive.ql.io.p >>>>>>>>>>>> arquet.MapredParquetOutputFormat >>>>>>>>>>>> Compressed: No >>>>>>>>>>>> Num Buckets: -1 >>>>>>>>>>>> Bucket Columns: [] >>>>>>>>>>>> Sort Columns: [] >>>>>>>>>>>> Storage Desc Params: >>>>>>>>>>>> path hdfs://localhost:8020/user/hiv >>>>>>>>>>>> e/warehouse/mine >>>>>>>>>>>> serialization.format 1 >>>>>>>>>>>> Time taken: 0.077 seconds, Fetched: 48 row(s) >>>>>>>>>>>> >>>>>>>>>>>> Now, I see your advise and support. Whats the issue? Am I doing >>>>>>>>>>>> wrong, it it a bug ? I am using Spark 2.2.1, HIVE 1.2.1, HADOOP >>>>>>>>>>>> 2.7.3. All >>>>>>>>>>>> class path, configuration are set properly. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> >>>>>>>>>>>> Ravion >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Feb 9, 2018 at 1:29 PM, Nicholas Hakobian < >>>>>>>>>>>> nicholas.hakob...@rallyhealth.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Its possible that the format of your table is not compatible >>>>>>>>>>>>> with your version of hive, so Spark saved it in a way such that >>>>>>>>>>>>> only Spark >>>>>>>>>>>>> can read it. When this happens it prints out a very visible >>>>>>>>>>>>> warning letting >>>>>>>>>>>>> you know this has happened. >>>>>>>>>>>>> >>>>>>>>>>>>> We've seen it most frequently when trying to save a parquet >>>>>>>>>>>>> file with a column in date format into a Hive table. In older >>>>>>>>>>>>> versions of >>>>>>>>>>>>> hive, its parquet reader/writer did not support Date formats >>>>>>>>>>>>> (among a >>>>>>>>>>>>> couple others). >>>>>>>>>>>>> >>>>>>>>>>>>> Nicholas Szandor Hakobian, Ph.D. >>>>>>>>>>>>> Staff Data Scientist >>>>>>>>>>>>> Rally Health >>>>>>>>>>>>> nicholas.hakob...@rallyhealth.com >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Feb 9, 2018 at 9:59 AM, Prakash Joshi < >>>>>>>>>>>>> prakashcjos...@gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Ravi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Can you send the result of >>>>>>>>>>>>>> Show create table your_table_name >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>> Prakash >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Feb 9, 2018 8:20 PM, "☼ R Nair (रविशंकर नायर)" < >>>>>>>>>>>>>> ravishankar.n...@gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> All, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It has been three days continuously I am on this issue. Not >>>>>>>>>>>>>>> getting any clue. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Environment: Spark 2.2.x, all configurations are correct. >>>>>>>>>>>>>>> hive-site.xml is in spark's conf. 
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 9, 2018 at 1:29 PM, Nicholas Hakobian <nicholas.hakob...@rallyhealth.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It's possible that the format of your table is not compatible with your version of Hive, so Spark saved it in a way that only Spark can read. When this happens, it prints out a very visible warning letting you know.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We've seen it most frequently when trying to save a parquet file with a column in date format into a Hive table. In older versions of Hive, its parquet reader/writer did not support Date formats (among a couple of others).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nicholas Szandor Hakobian, Ph.D.
>>>>>>>>>>>>> Staff Data Scientist
>>>>>>>>>>>>> Rally Health
>>>>>>>>>>>>> nicholas.hakob...@rallyhealth.com
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 9, 2018 at 9:59 AM, Prakash Joshi <prakashcjos...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ravi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you send the result of:
>>>>>>>>>>>>>> Show create table your_table_name
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Prakash
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Feb 9, 2018 8:20 PM, "☼ R Nair (रविशंकर नायर)" <ravishankar.n...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> All,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have been on this issue for three days straight now and am not getting any clue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Environment: Spark 2.2.x; all configurations are correct; hive-site.xml is in Spark's conf.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1) Step 1: I created a data frame DF1 reading a csv file.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2) Did manipulations on DF1. The resulting frame is passion_df.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3) passion_df.write.format("orc").saveAsTable("sampledb.passion")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 4) The metastore shows the Hive table; when I do "show tables" in HIVE, I can see the table name.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 5) I can't select in HIVE, though I can select from SPARK as spark.sql("select * from sampledb.passion").
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What's going on here? Please help. Why am I not seeing the data from the HIVE prompt?
>>>>>>>>>>>>>>> The "describe formatted" command on the table in HIVE shows the data is in the default warehouse location (/user/hive/warehouse), since I set it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am not getting any definite answer anywhere. Many suggestions and answers are given on Stack Overflow et al. Nothing really works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So asking the experts here for some light on this. Thanks.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Ravion
>>>>>>>
>>>>>>> --
>>>>>>> Shmuel Blitz
>>>>>>> Big Data Developer
>>>>>>> Email: shmuel.bl...@similarweb.com
>>>>>>> www.similarweb.com
>>>>>
>>>>> --
>>>>> Thanks
>>>>> Deepak
>>>>> www.bigdatabig.com
>>>>> www.keosha.net
--