[ https://issues.apache.org/jira/browse/SPARK-11475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990310#comment-14990310 ]
Rekha Joshi commented on SPARK-11475:
-------------------------------------

Great [~zhangxiongfei]. Glad you no longer have an issue with the DataFrame saveAsTable() for Hive that you raised earlier. Please note that saveAsParquetFile() gives you the flexibility to save to an exact path as needed; for Hive, the location is resolved from your Hive metastore settings. This is general Hive/HA setup behavior, not Spark related.

{code}
You can check your Hive metastore FS root with:
hive --service metatool -listFSRoot

To see what an update would change, do a dry run:
hive --service metatool -updateLocation <nameservice-uri> <namenode-uri> -dryRun

and then update by running the same command without -dryRun
{code}

If you make changes to the core/hdfs/hive config XML files, you need to restart the metastore. Please also refer to your Hadoop provider's setup docs.

[~srowen]: if you and [~zhangxiongfei] agree, can we mark this issue closed? Thanks.

> DataFrame API saveAsTable() does not work well for HDFS HA
> ----------------------------------------------------------
>
>                 Key: SPARK-11475
>                 URL: https://issues.apache.org/jira/browse/SPARK-11475
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.1
>         Environment: Hadoop 2.4 & Spark 1.5.1
>            Reporter: zhangxiongfei
>         Attachments: dataFrame_saveAsTable.txt, hdfs-site.xml, hive-site.xml
>
> I was trying to save a DF to Hive using the following code:
> {quote}
> sqlContext.range(1L,1000L,2L,2).coalesce(1).saveAsTable("dataframeTable")
> {quote}
> But got the exception below:
> {quote}
> Warning: there were 1 deprecation warning(s); re-run with -deprecation for details
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
> 	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1610)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1193)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3516)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:785)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(
> {quote}
> *My Hive configuration is*:
> {quote}
> <property>
>   <name>hive.metastore.warehouse.dir</name>
>   <value>*/apps/hive/warehouse*</value>
> </property>
> {quote}
> It seems that HDFS HA is not being used for the warehouse location, so I tried the code below:
> {quote}
> sqlContext.range(1L,1000L,2L,2).coalesce(1).saveAsParquetFile("hdfs://bitautodmp/apps/hive/warehouse/dataframeTable")
> {quote}
> I could verify that the *saveAsParquetFile* API worked well with the following commands:
> {quote}
> *hadoop fs -ls /apps/hive/warehouse/dataframeTable*
> Found 4 items
> -rw-r--r--   3 zhangxf hdfs       0 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/_SUCCESS*
> -rw-r--r--   3 zhangxf hdfs     199 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/_common_metadata*
> -rw-r--r--   3 zhangxf hdfs     325 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/_metadata*
> -rw-r--r--   3 zhangxf hdfs    1098 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/part-r-00000-a05a9bf3-b2a6-40e5-b180-818efb2a0f54.gz.parquet*
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
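For reference, the metatool commands and the working saveAsParquetFile() call assume a client configuration where the filesystem URI is the HA nameservice rather than a single NameNode. A minimal illustrative fragment is below; the nameservice name `bitautodmp` is taken from the reporter's example, but the NameNode host names `nn1host`/`nn2host` and ports are hypothetical placeholders, not from this issue:

```xml
<!-- core-site.xml: clients resolve the logical nameservice, not one NameNode -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://bitautodmp</value>
</property>

<!-- hdfs-site.xml: map the nameservice to both NameNodes (hosts are hypothetical) -->
<property>
  <name>dfs.nameservices</name>
  <value>bitautodmp</value>
</property>
<property>
  <name>dfs.ha.namenodes.bitautodmp</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bitautodmp.nn1</name>
  <value>nn1host:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bitautodmp.nn2</name>
  <value>nn2host:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.bitautodmp</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

If the FS root stored in the metastore still points at a concrete NameNode URI (which `-listFSRoot` would show), paths resolved through the metastore can be sent to whichever NameNode is currently standby and fail with StandbyException; rewriting the stored location to the nameservice URI with `-updateLocation` is what fixes that.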