[ https://issues.apache.org/jira/browse/SPARK-11475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990310#comment-14990310 ]
Rekha Joshi commented on SPARK-11475:
-------------------------------------

Great [~zhangxiongfei]. Glad you no longer have an issue with the DataFrame saveAsTable() for Hive that you raised earlier. Please note that saveAsParquetFile() gives you the flexibility to save to an exact path as needed; for Hive, the location is resolved from your Hive metastore settings. This is general Hive/HA setup behavior, not Spark related.

{code}
You can check your Hive metastore FS root with:
hive --service metatool -listFSRoot

To see what an update would change, do a dry run:
hive --service metatool -updateLocation <nameservice-uri> <namenode-uri> -dryRun

and then update by running the same command without -dryRun
{code}

If you make changes to the core/hdfs/hive config XML files, you need to restart the metastore. Please also refer to your Hadoop provider's setup docs.

[~srowen]: if you and [~zhangxiongfei] agree, can we mark this issue closed? Thanks.

> DataFrame API saveAsTable() does not work well for HDFS HA
> ----------------------------------------------------------
>
>                 Key: SPARK-11475
>                 URL: https://issues.apache.org/jira/browse/SPARK-11475
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.1
>         Environment: Hadoop 2.4 & Spark 1.5.1
>            Reporter: zhangxiongfei
>         Attachments: dataFrame_saveAsTable.txt, hdfs-site.xml, hive-site.xml
>
> I was trying to save a DF to Hive using the following code:
> {quote}
> sqlContext.range(1L,1000L,2L,2).coalesce(1).saveAsTable("dataframeTable")
> {quote}
> But got the exception below:
> {quote}
> Warning: there were 1 deprecation warning(s); re-run with -deprecation for details
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
> 	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1610)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1193)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3516)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:785)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(
> {quote}
> *My Hive configuration is*:
> {quote}
> <property>
>   <name>hive.metastore.warehouse.dir</name>
>   <value>*/apps/hive/warehouse*</value>
> </property>
> {quote}
> It seems that HDFS HA is not being used for the warehouse location, so I tried the code below:
> {quote}
> sqlContext.range(1L,1000L,2L,2).coalesce(1).saveAsParquetFile("hdfs://bitautodmp/apps/hive/warehouse/dataframeTable")
> {quote}
> I could verify that the *saveAsParquetFile* API worked well with the following commands:
> {quote}
> *hadoop fs -ls /apps/hive/warehouse/dataframeTable*
> Found 4 items
> -rw-r--r--   3 zhangxf hdfs       0 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/_SUCCESS*
> -rw-r--r--   3 zhangxf hdfs     199 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/_common_metadata*
> -rw-r--r--   3 zhangxf hdfs     325 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/_metadata*
> -rw-r--r--   3 zhangxf hdfs    1098 2015-11-03 17:57 */apps/hive/warehouse/dataframeTable/part-r-00000-a05a9bf3-b2a6-40e5-b180-818efb2a0f54.gz.parquet*
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
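For reference, the metatool commands and the working saveAsParquetFile() call assume a client configuration where the filesystem URI is the HA nameservice rather than a single NameNode. A minimal illustrative fragment is below; the nameservice name `bitautodmp` is taken from the reporter's example, but the NameNode host names `nn1host`/`nn2host` and ports are hypothetical placeholders, not from this issue:

```xml
<!-- core-site.xml: clients resolve the logical nameservice, not one NameNode -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://bitautodmp</value>
</property>

<!-- hdfs-site.xml: map the nameservice to both NameNodes (hosts are hypothetical) -->
<property>
  <name>dfs.nameservices</name>
  <value>bitautodmp</value>
</property>
<property>
  <name>dfs.ha.namenodes.bitautodmp</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bitautodmp.nn1</name>
  <value>nn1host:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bitautodmp.nn2</name>
  <value>nn2host:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.bitautodmp</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

If the FS root stored in the metastore still points at a concrete NameNode URI (which `-listFSRoot` would show), paths resolved through the metastore can be sent to whichever NameNode is currently standby and fail with StandbyException; rewriting the stored location to the nameservice URI with `-updateLocation` is what fixes that.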