[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364088#comment-16364088
 ] 

Pallapothu Jyothi Swaroop commented on SPARK-23402:
---

OK, I will try with master.

Do I need to build it locally, or is there somewhere I can get Maven 
dependencies for the master branch?

> Dataset write method not working as expected for postgresql database
> 
>
> Key: SPARK-23402
> URL: https://issues.apache.org/jira/browse/SPARK-23402
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 2.2.1
> Environment: PostgreSQL: 9.5.8 (same issue on 10+)
> OS: CentOS 7 & Windows 7, 8
> JDBC: 9.4-1201-jdbc41
>  
> Spark: I executed it in both 2.1.0 and 2.2.1
> Mode: Standalone
> OS: Windows 7
>Reporter: Pallapothu Jyothi Swaroop
>Priority: Major
> Attachments: Emsku[1].jpg
>
>
> I am using the Spark Dataset write method to insert data into an existing 
> PostgreSQL table, with the write mode set to append. When I run it, I get an 
> exception that the table already exists, even though I specified append mode.
> It's strange: when I point the same options at SQL Server/Oracle, append 
> mode works as expected.
>  
> *Database Properties:*
> {{destinationProps.put("driver", "org.postgresql.Driver"); 
> destinationProps.put("url", "jdbc:postgresql://127.0.0.1:30001/dbmig"); 
> destinationProps.put("user", "dbmig");}}
> {{destinationProps.put("password", "dbmig");}}
>  
> *Dataset Write Code:*
> {{valueAnalysisDataset.write().mode(SaveMode.Append).jdbc(destinationDbMap.get("url"),
>  "dqvalue", destinationdbProperties);}} 
>  
>  
> {{Exception in thread "main" org.postgresql.util.PSQLException: ERROR: 
> relation "dqvalue" already exists at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
>  at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
>  at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297) 
> at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428) at 
> org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) at 
> org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:301) at 
> org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:287) at 
> org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:264) at 
> org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:244) at 
> org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:806)
>  at 
> org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:95)
>  at 
> org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:469)
>  at 
> org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
>  at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
>  at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) 
> at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609) 
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233) at 
> org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:460) at 
> com.ads.dqam.action.impl.PostgresValueAnalysis.persistValueAnalysis(PostgresValueAnalysis.java:25)
>  at 
> com.ads.dqam.action.AbstractValueAnalysis.persistAnalysis(AbstractValueAnalysis.java:81)
>  at com.ads.dqam.Analysis.doAnalysis(Analysis.java:32) at 
> com.ads.dqam.Client.main(Client.java:71)}}
>  
>  
>  
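For anyone trying to replicate this, here is a minimal, self-contained sketch of the scenario described above. It is an illustration, not a verified test: the URL, table name, and credentials come from the report, while the session setup and the one-row schema are invented for demonstration.

{code:java}
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class AppendRepro {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SPARK-23402 repro")
                .master("local[*]")
                .getOrCreate();

        // Connection details taken from the issue description.
        Properties props = new Properties();
        props.put("driver", "org.postgresql.Driver");
        props.put("user", "dbmig");
        props.put("password", "dbmig");
        String url = "jdbc:postgresql://127.0.0.1:30001/dbmig";

        // Any small Dataset will do; this one-row schema is invented.
        Dataset<Row> df = spark.sql("SELECT 1 AS id, 'x' AS name");

        // The first write creates the table; the second should append to it.
        // Per the report, on PostgreSQL the second call instead fails with
        // ERROR: relation "dqvalue" already exists.
        df.write().mode(SaveMode.Append).jdbc(url, "dqvalue", props);
        df.write().mode(SaveMode.Append).jdbc(url, "dqvalue", props);

        spark.stop();
    }
}
{code}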



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363749#comment-16363749
 ] 

Pallapothu Jyothi Swaroop commented on SPARK-23402:
---

I am using version 2.2.1 in my project, so please check with 2.2.1.




[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363726#comment-16363726
 ] 

Pallapothu Jyothi Swaroop commented on SPARK-23402:
---

[~mgaido] Please confirm: did the table already exist in the database? I am 
getting the issue only for tables that already exist in the schema.

I tried with Postgres 10 and driver 42.2.1 on Windows 8. No success.



[jira] [Comment Edited] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-13 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363555#comment-16363555
 ] 

Pallapothu Jyothi Swaroop edited comment on SPARK-23402 at 2/14/18 6:48 AM:


[~kevinyu98]

Thanks for checking again. I tested with 9.5.4; append mode works without an 
exception.

I analyzed something that may be useful for you.

Can you check the Scala file below?
https://github.com/apache/spark/blob/v2.2.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcRelationProvider.scala

The statement below checks whether the table exists, and it is the statement 
that fails:
  val tableExists = JdbcUtils.tableExists(conn, options)

But I am not sure why it fails. I executed the table-exists SQL taken from the 
Postgres dialect, and it ran successfully in the database.
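To test that hypothesis outside Spark, a plain-JDBC probe along the following lines can be used. This is a sketch under stated assumptions: JdbcUtils.tableExists treats any SQLException from the dialect's probe query as "table does not exist", and the exact probe text below is an assumption to verify against PostgresDialect in your Spark version.

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class TableExistsProbe {
    public static void main(String[] args) throws Exception {
        // Connection details taken from the issue description.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://127.0.0.1:30001/dbmig", "dbmig", "dbmig");

        // Assumed probe query; confirm the exact text in PostgresDialect.
        String probe = "SELECT 1 FROM dqvalue LIMIT 1";

        try (Statement stmt = conn.createStatement()) {
            stmt.executeQuery(probe);
            System.out.println("Probe succeeded -> Spark should append.");
        } catch (SQLException e) {
            // If this fires even though the table exists, Spark takes the
            // CREATE TABLE path and hits "relation already exists".
            System.out.println("Probe failed -> Spark would CREATE TABLE: " + e);
        } finally {
            conn.close();
        }
    }
}
{code}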


was (Author: swaroopp):
Thanks for checking again. I tested with 9.5.4; append mode works without an 
exception.

I analyzed something that may be useful for you.

Can you check the Scala file below?
https://github.com/apache/spark/blob/v2.2.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcRelationProvider.scala

The statement below checks whether the table exists, and it is the statement 
that fails:
  val tableExists = JdbcUtils.tableExists(conn, options)

But I am not sure why it fails. I executed the table-exists SQL taken from the 
Postgres dialect, and it ran successfully in the database.


[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-13 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363555#comment-16363555
 ] 

Pallapothu Jyothi Swaroop commented on SPARK-23402:
---

Thanks for checking again. I tested with 9.5.4; append mode works without an 
exception.

I analyzed something that may be useful for you.

Can you check the Scala file below?
https://github.com/apache/spark/blob/v2.2.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcRelationProvider.scala

The statement below checks whether the table exists, and it is the statement 
that fails:
  val tableExists = JdbcUtils.tableExists(conn, options)

But I am not sure why it fails. I executed the table-exists SQL taken from the 
Postgres dialect, and it ran successfully in the database.


[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-13 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363512#comment-16363512
 ] 

Pallapothu Jyothi Swaroop commented on SPARK-23402:
---

[~kevinyu98] Did you create the table before executing the instructions above? 
It throws the exception only when the table already exists in the database. 
Please run the statements again; you will get the exception. Let me know 
whether the issue is replicated.


[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallapothu Jyothi Swaroop updated SPARK-23402:
--
Description: 
I am using the Spark Dataset write method to insert data into an existing 
PostgreSQL table, with the write mode set to append. When I run it, I get an 
exception that the table already exists, even though I specified append mode.

It's strange: when I point the same options at SQL Server/Oracle, append mode 
works as expected.

 

*Database Properties:*

{{destinationProps.put("driver", "org.postgresql.Driver"); 
destinationProps.put("url", "jdbc:postgresql://127.0.0.1:30001/dbmig"); 
destinationProps.put("user", "dbmig");}}

{{destinationProps.put("password", "dbmig");}}

 

*Dataset Write Code:*

{{valueAnalysisDataset.write().mode(SaveMode.Append).jdbc(destinationDbMap.get("url"),
 "dqvalue", destinationdbProperties);}} 

 

 

{{Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation 
"dqvalue" already exists at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
 at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
 at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297) at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428) at 
org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:301) at 
org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:287) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:264) at 
org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:244) at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:806)
 at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:95)
 at 
org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:469)
 at 
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) 
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) 
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
 at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) at 
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609) at 
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233) at 
org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:460) at 
com.ads.dqam.action.impl.PostgresValueAnalysis.persistValueAnalysis(PostgresValueAnalysis.java:25)
 at 
com.ads.dqam.action.AbstractValueAnalysis.persistAnalysis(AbstractValueAnalysis.java:81)
 at com.ads.dqam.Analysis.doAnalysis(Analysis.java:32) at 
com.ads.dqam.Client.main(Client.java:71)}}

 

 

 

  was:
I am using the Spark Dataset write method to insert data into an existing 
PostgreSQL table, with the write mode set to append. When I run it, I get an 
exception that the table already exists, even though I specified append mode.

It's strange: when I point the same options at SQL Server/Oracle, append mode 
works as expected.

 

*Database Properties:*

{{destinationProps.put("driver", "org.postgresql.Driver"); 
destinationProps.put("url", "jdbc:postgresql://127.0.0.1:30001/dbmig"); 
destinationProps.put("user", "dbmig"); destinationProps.put("password", 
"dbmig");}}

*Dataset Write Code:*

{{valueAnalysisDataset.write().mode(SaveMode.Append).jdbc(destinationDbMap.get("url"),
 "dqvalue", destinationdbProperties);}} 

 

 

{{Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation 
"dqvalue" already exists at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
 at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
 at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297) at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428) at 
org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) at 

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallapothu Jyothi Swaroop updated SPARK-23402:
--
Description: 
I am using the Spark Dataset write method to insert data into an existing 
PostgreSQL table, with the write mode set to append. When I run it, I get an 
exception that the table already exists, even though I specified append mode.

It's strange: when I point the same options at SQL Server/Oracle, append mode 
works as expected.

 

*Database Properties:*

{{destinationProps.put("driver", "org.postgresql.Driver"); 
destinationProps.put("url", "jdbc:postgresql://127.0.0.1:30001/dbmig"); 
destinationProps.put("user", "dbmig"); destinationProps.put("password", 
"dbmig");}}

*Dataset Write Code:*

{{valueAnalysisDataset.write().mode(SaveMode.Append).jdbc(destinationDbMap.get("url"),
 "dqvalue", destinationdbProperties);}} 

 

 

{{Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation 
"dqvalue" already exists at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
 at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
 at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297) at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428) at 
org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:301) at 
org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:287) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:264) at 
org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:244) at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:806)
 at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:95)
 at 
org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:469)
 at 
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) 
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) 
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
 at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) at 
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609) at 
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233) at 
org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:460) at 
com.ads.dqam.action.impl.PostgresValueAnalysis.persistValueAnalysis(PostgresValueAnalysis.java:25)
 at 
com.ads.dqam.action.AbstractValueAnalysis.persistAnalysis(AbstractValueAnalysis.java:81)
 at com.ads.dqam.Analysis.doAnalysis(Analysis.java:32) at 
com.ads.dqam.Client.main(Client.java:71)}}

 

 

 

  was:
I am using the Spark Dataset write method to insert data into an existing 
PostgreSQL table, with the write mode set to append. When I run it, I get an 
exception that the table already exists, even though I specified append mode.

It's strange: when I point the same options at SQL Server/Oracle, append mode 
works as expected.

 

{{Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation 
"dqvalue" already exists at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
 at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
 at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297) at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428) at 
org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:301) at 
org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:287) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:264) at 
org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:244) at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:806)
 at 

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallapothu Jyothi Swaroop updated SPARK-23402:
--
Attachment: Emsku[1].jpg




[jira] [Created] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
Pallapothu Jyothi Swaroop created SPARK-23402:
-

 Summary: Dataset write method not working as expected for 
postgresql database
 Key: SPARK-23402
 URL: https://issues.apache.org/jira/browse/SPARK-23402
 Project: Spark
  Issue Type: Bug
  Components: Spark Core, SQL
Affects Versions: 2.2.1
 Environment: PostgreSQL: 9.5.8 (same issue on 10+)

OS: CentOS 7 & Windows 7, 8

JDBC: 9.4-1201-jdbc41

 

Spark: I executed it in both 2.1.0 and 2.2.1

Mode: Standalone

OS: Windows 7
Reporter: Pallapothu Jyothi Swaroop


I am using the Spark Dataset write method to insert data into an existing 
PostgreSQL table, with the write mode set to append. When I run it, I get an 
exception that the table already exists, even though I specified append mode.

It's strange: when I point the same options at SQL Server/Oracle, append mode 
works as expected.

 

{{Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation 
"dqvalue" already exists at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
 at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
 at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297) at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428) at 
org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:301) at 
org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:287) at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:264) at 
org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:244) at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:806)
 at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:95)
 at 
org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:469)
 at 
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) 
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) 
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
 at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) at 
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609) at 
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233) at 
org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:460) at 
com.ads.dqam.action.impl.PostgresValueAnalysis.persistValueAnalysis(PostgresValueAnalysis.java:25)
 at 
com.ads.dqam.action.AbstractValueAnalysis.persistAnalysis(AbstractValueAnalysis.java:81)
 at com.ads.dqam.Analysis.doAnalysis(Analysis.java:32) at 
com.ads.dqam.Client.main(Client.java:71)}}

 

 

 






[jira] [Comment Edited] (SPARK-16567) how to increase performance of rdbms dataframe.

2016-07-15 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15379105#comment-15379105
 ] 

Pallapothu Jyothi Swaroop edited comment on SPARK-16567 at 7/15/16 9:30 AM:


Thanks.
What is user@, and how do I open an issue on user@?


was (Author: swaroopp):
What is user@, and how do I open an issue on user@?

> how to increase performance of rdbms dataframe.
> ---
>
> Key: SPARK-16567
> URL: https://issues.apache.org/jira/browse/SPARK-16567
> Project: Spark
>  Issue Type: Question
>Reporter: Pallapothu Jyothi Swaroop
>Priority: Critical
>
> Hello,
> How do I increase the performance of an RDBMS DataFrame?
> I need to perform a group-by on the fetched data.
> I did it like this:
> DataFrame jdbcDF = 
> this.SQLCONTEXT.read().format("jdbc").options(options).load();
> // options is a map containing the DB configuration
> DataFrame groupedDataFrame = 
> jdbcDF.groupBy("UNQ_STR").count();
> How do I tune this?
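Since the question is quoted in full above, here is a small sketch of one common tuning step for this pattern. It assumes the aggregation (not the read) is the bottleneck and targets the Spark 1.x DataFrame API shown in the question; the value 64 is an illustrative placeholder, typically sized to a small multiple of the total executor cores.

{code:java}
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class GroupByTuning {
    // sqlContext and jdbcDF are assumed to be built as in the question.
    static DataFrame groupWithTunedShuffle(SQLContext sqlContext, DataFrame jdbcDF) {
        // groupBy().count() shuffles into spark.sql.shuffle.partitions
        // partitions (200 by default); tune it to the cluster size before
        // running the aggregation.
        sqlContext.setConf("spark.sql.shuffle.partitions", "64");
        return jdbcDF.groupBy("UNQ_STR").count();
    }
}
{code}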






[jira] [Commented] (SPARK-16567) how to increase performance of rdbms dataframe.

2016-07-15 Thread Pallapothu Jyothi Swaroop (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15379105#comment-15379105
 ] 

Pallapothu Jyothi Swaroop commented on SPARK-16567:
---

What is user@, and how do I open an issue on user@?




[jira] [Updated] (SPARK-16567) how to increase performance of rdbms dataframe.

2016-07-15 Thread Pallapothu Jyothi Swaroop (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallapothu Jyothi Swaroop updated SPARK-16567:
--
Summary: how to increase performance of rdbms dataframe.  (was: How to 
parallelize RDBMS dataframe and perform group by.)




[jira] [Created] (SPARK-16567) How to parallelize RDBMS dataframe and perform group by.

2016-07-15 Thread Pallapothu Jyothi Swaroop (JIRA)
Pallapothu Jyothi Swaroop created SPARK-16567:
-

 Summary: How to parallelize RDBMS dataframe and perform group by.
 Key: SPARK-16567
 URL: https://issues.apache.org/jira/browse/SPARK-16567
 Project: Spark
  Issue Type: Question
Reporter: Pallapothu Jyothi Swaroop
Priority: Critical


Hello,

How do I increase the performance of an RDBMS DataFrame?

I need to perform a group-by on the fetched data.

I did it like this:

DataFrame jdbcDF = 
this.SQLCONTEXT.read().format("jdbc").options(options).load();
// options is a map containing the DB configuration

DataFrame groupedDataFrame = 
jdbcDF.groupBy("UNQ_STR").count();


How do I tune this?






[jira] [Created] (SPARK-16565) Implementation for processing 50-70 GB of data using Java

2016-07-15 Thread Pallapothu Jyothi Swaroop (JIRA)
Pallapothu Jyothi Swaroop created SPARK-16565:
-

 Summary: Implementation for processing 50-70 GB of data using Java
 Key: SPARK-16565
 URL: https://issues.apache.org/jira/browse/SPARK-16565
 Project: Spark
  Issue Type: Question
 Environment: For development we are using an i3 4-core processor, Windows 7, 
8 GB RAM.
For production we have a cluster with 4 nodes, i5 4-core processors, CentOS, 
16 GB RAM.
Reporter: Pallapothu Jyothi Swaroop


Hello,

I need the implementation and configuration steps for the following 
requirement.

I need to do analysis of columns in RDBMS tables.

Steps:

Step 1: Load the required column from a table that exists in the RDBMS (Oracle).
Step 2: Group that data.
Step 3: Analyze the grouped data using UDFs.
Step 4: Persist the analyzed data to Hive or MongoDB (please suggest which to 
choose).

I followed the steps below, but they have performance issues (one common 
adjustment to step 1 is sketched after this list).

1. Loaded the column data from the RDBMS into a DataFrame.
DataFrame jdbcDF = 
this.SQLCONTEXT.read().format("jdbc").options(options).load();
// options is a map containing the DB configuration

2. Grouped that data.
DataFrame groupedDataFrame = 
jdbcDF.groupBy("UNQ_STR").count();

3. Performed the required analysis on the data using UDFs such as length (I am 
running 7 UDFs), each of which returns another DataFrame.
I used Spark SQL to apply the UDFs.

4. Saved the step 3 DataFrame to Hive or MongoDB.
For Hive: used hiveContext.sql("insert ..");
For MongoDB: used the MongoSpark API.
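The commonly suggested adjustment to step 1 is to partition the JDBC read so Spark opens several parallel connections instead of pulling all 50 GB through a single task. This is a sketch under stated assumptions: the URL, credentials, table name, partition column "ID_COL", its bounds, and the 64 partitions are all placeholders to replace with real values, and the partition column should be numeric and indexed.

{code:java}
import java.util.Properties;

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class PartitionedOracleRead {
    // sqlContext is assumed to be created as elsewhere in this report.
    static DataFrame loadPartitioned(SQLContext sqlContext) {
        Properties props = new Properties();
        props.put("user", "...");      // placeholder credentials
        props.put("password", "...");

        // Supplying a numeric partition column and its bounds makes Spark
        // split the read into numPartitions parallel range queries.
        return sqlContext.read().jdbc(
                "jdbc:oracle:thin:@//dbhost:1521/service", // placeholder URL
                "SOURCE_TABLE",                            // placeholder table
                "ID_COL",                                  // assumed partition column
                1L, 1000000000L,                           // assumed min/max of ID_COL
                64,                                        // assumed partition count
                props);
    }
}
{code}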

Memory measurements:

Column A has 50 GB (1 billion rows) of data before analysis.
After the analysis completes, it may grow to 120 GB - 200 GB depending on the 
number of unique values.

Performance measurements for a single node with different components are as 
follows:

1. RDBMS to Spark to Hive -- 50 hrs
2. RDBMS to Spark to MongoDB -- 35 hrs
3. Sqoop to Hive to Spark to MongoDB -- 30 hrs

How do I increase the performance on the cluster for the above requirement? 
Please provide implementation steps. I need to process this data in 1 hr.

I have been working on this requirement for the last 45 days and cannot 
improve the performance; please help.

For testing I am using a 4-node cluster.

Thanks and kind regards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org