[jira] [Commented] (SPARK-27723) Unable to pull the oracle table data using patitionColumn date/timeStamp
[ https://issues.apache.org/jira/browse/SPARK-27723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843432#comment-16843432 ] Shyama commented on SPARK-27723: [~yumwang] any help plz [https://stackoverflow.com/questions/56208362/numberformatexception-thrown-when-passed-date-as-lowerbound-upperbound-in-spark] > Unable to pull the oracle table data using patitionColumn date/timeStamp > > > Key: SPARK-27723 > URL: https://issues.apache.org/jira/browse/SPARK-27723 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.1 > Environment: Me using spark-sql-2.4.1v , java8 . ojdbc6.jar >Reporter: Shyama >Priority: Major > > > Reproduction steps : > 1. create oracle table > create table schema1.modal_vals( > FAMILY_ID NOT NULL NUMBER, > INSERTION_DATE NOT NULL DATE, > ITEM_VALUE VARCHAR2(4000), > YEAR NUMBER, > QUARTER NUMBER, > LAST_UPDATE_DATE DATE > ) > 3. insert data into oracle table with date column format like "30-JUN-02" > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-17","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","bb",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-OCT-13","a-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-03","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-FEB-03","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JAN-19","aaa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-18","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"01-MAY-19","bb-",2013,2,null); > > > > //please fill the respected oracle details > DataFrameReader ora_df_reader = spark.read().format("jdbc") > .option("url", o_url) > .option("driver", Constants.ORACLE_DRIVER) > .option("user", o_userName) > .option("password", o_passwd) > .option("fetchsize",1000); > > Dataset ss = ora_df_reader > .option("inferSchema", true) > .option("schema","schema1") > .option("numPartitions", 20); > .option("partitionColumn", "INSERTION_DATE"); > .option("lowerBound", "2002-03-31" ) > .option("upperBound", "2019-05-01") > .option("dateFormat", "-MM-dd" )// Tried all "-mm-dd" ,"-MM-dd" > "-MM-DD" "DD-MMM-YY" "dd-MMM-yy" > .option("dbtable", "select * from schema1.modal_vals") > .load(); > > > Error : > {{java.sql.SQLException: ORA-12801: error signaled in parallel query server > P001(2) ORA-01861: literal does not match format string}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27723) Unable to pull the oracle table data using patitionColumn date/timeStamp
[ https://issues.apache.org/jira/browse/SPARK-27723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843425#comment-16843425 ] Shyama commented on SPARK-27723: [~yumwang] When I run in the spark-cluster , same thing giving a number format error ... should I set parameter in spark cluster in specific interesting it is running fine in my local IDE without any error. java.lang.NumberFormatException: For input string: "2002-03-31" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:589) at java.lang.Long.parseLong(Long.java:631) at scala.collection.immutable.StringLike$class.toLong(StringLike.scala:276) at scala.collection.immutable.StringOps.toLong(StringOps.scala:29) at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$9.apply(JDBCOptions.scala:98) at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$9.apply(JDBCOptions.scala:98) at scala.Option.map(Option.scala:146) at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.(JDBCOptions.scala:98) at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.(JDBCOptions.scala:35) at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:341) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164) > Unable to pull the oracle table data using patitionColumn date/timeStamp > > > Key: SPARK-27723 > URL: https://issues.apache.org/jira/browse/SPARK-27723 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.1 > Environment: Me using spark-sql-2.4.1v , java8 . ojdbc6.jar >Reporter: Shyama >Priority: Major > > > Reproduction steps : > 1. create oracle table > create table schema1.modal_vals( > FAMILY_ID NOT NULL NUMBER, > INSERTION_DATE NOT NULL DATE, > ITEM_VALUE VARCHAR2(4000), > YEAR NUMBER, > QUARTER NUMBER, > LAST_UPDATE_DATE DATE > ) > 3. insert data into oracle table with date column format like "30-JUN-02" > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-17","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","bb",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-OCT-13","a-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-03","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-FEB-03","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JAN-19","aaa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-18","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"01-MAY-19","bb-",2013,2,null); > > > > //please fill the respected oracle details > DataFrameReader ora_df_reader = spark.read().format("jdbc") > .option("url", o_url) > .option("driver", Constants.ORACLE_DRIVER) > .option("user", o_userName) > .option("password", o_passwd) > .option("fetchsize",1000); > > Dataset ss = ora_df_reader > .option("inferSchema", true) > .option("schema","schema1") > .option("numPartitions", 20); > .option("partitionColumn", "INSERTION_DATE"); > .option("lowerBound", "2002-03-31" ) > .option("upperBound",
[jira] [Commented] (SPARK-27723) Unable to pull the oracle table data using patitionColumn date/timeStamp
[ https://issues.apache.org/jira/browse/SPARK-27723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843422#comment-16843422 ] Shyama commented on SPARK-27723: [~smilegator] sure , please let me know how to do it i.e submit a PR ? > Unable to pull the oracle table data using patitionColumn date/timeStamp > > > Key: SPARK-27723 > URL: https://issues.apache.org/jira/browse/SPARK-27723 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.1 > Environment: Me using spark-sql-2.4.1v , java8 . ojdbc6.jar >Reporter: Shyama >Priority: Major > > > Reproduction steps : > 1. create oracle table > create table schema1.modal_vals( > FAMILY_ID NOT NULL NUMBER, > INSERTION_DATE NOT NULL DATE, > ITEM_VALUE VARCHAR2(4000), > YEAR NUMBER, > QUARTER NUMBER, > LAST_UPDATE_DATE DATE > ) > 3. insert data into oracle table with date column format like "30-JUN-02" > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-17","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","bb",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-OCT-13","a-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-03","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-FEB-03","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JAN-19","aaa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-18","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"01-MAY-19","bb-",2013,2,null); > > > > //please fill the respected oracle details > DataFrameReader ora_df_reader = spark.read().format("jdbc") > .option("url", o_url) > .option("driver", Constants.ORACLE_DRIVER) > .option("user", o_userName) > .option("password", o_passwd) > .option("fetchsize",1000); > > Dataset ss = ora_df_reader > .option("inferSchema", true) > .option("schema","schema1") > .option("numPartitions", 20); > .option("partitionColumn", "INSERTION_DATE"); > .option("lowerBound", "2002-03-31" ) > .option("upperBound", "2019-05-01") > .option("dateFormat", "-MM-dd" )// Tried all "-mm-dd" ,"-MM-dd" > "-MM-DD" "DD-MMM-YY" "dd-MMM-yy" > .option("dbtable", "select * from schema1.modal_vals") > .load(); > > > Error : > {{java.sql.SQLException: ORA-12801: error signaled in parallel query server > P001(2) ORA-01861: literal does not match format string}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27723) Unable to pull the oracle table data using patitionColumn date/timeStamp
[ https://issues.apache.org/jira/browse/SPARK-27723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840456#comment-16840456 ] Shyama commented on SPARK-27723: [~yumwang] by the way when I tried with timestamp. I used .option("customSchema","INSERT_DATE TIMESTAMP") But still it was thowing error. Anyhow solved the issue with DATE but TIMESTAMP is pending. > Unable to pull the oracle table data using patitionColumn date/timeStamp > > > Key: SPARK-27723 > URL: https://issues.apache.org/jira/browse/SPARK-27723 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.1 > Environment: Me using spark-sql-2.4.1v , java8 . ojdbc6.jar >Reporter: Shyama >Priority: Major > > > Reproduction steps : > 1. create oracle table > create table schema1.modal_vals( > FAMILY_ID NOT NULL NUMBER, > INSERTION_DATE NOT NULL DATE, > ITEM_VALUE VARCHAR2(4000), > YEAR NUMBER, > QUARTER NUMBER, > LAST_UPDATE_DATE DATE > ) > 3. insert data into oracle table with date column format like "30-JUN-02" > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-17","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","bb",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-OCT-13","a-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-03","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-FEB-03","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JAN-19","aaa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-18","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"01-MAY-19","bb-",2013,2,null); > > > > //please fill the respected oracle details > DataFrameReader ora_df_reader = spark.read().format("jdbc") > .option("url", o_url) > .option("driver", Constants.ORACLE_DRIVER) > .option("user", o_userName) > .option("password", o_passwd) > .option("fetchsize",1000); > > Dataset ss = ora_df_reader > .option("inferSchema", true) > .option("schema","schema1") > .option("numPartitions", 20); > .option("partitionColumn", "INSERTION_DATE"); > .option("lowerBound", "2002-03-31" ) > .option("upperBound", "2019-05-01") > .option("dateFormat", "-MM-dd" )// Tried all "-mm-dd" ,"-MM-dd" > "-MM-DD" "DD-MMM-YY" "dd-MMM-yy" > .option("dbtable", "select * from schema1.modal_vals") > .load(); > > > Error : > {{java.sql.SQLException: ORA-12801: error signaled in parallel query server > P001(2) ORA-01861: literal does not match format string}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27723) Unable to pull the oracle table data using patitionColumn date/timeStamp
[ https://issues.apache.org/jira/browse/SPARK-27723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840442#comment-16840442 ] Shyama commented on SPARK-27723: [~yumwang] wow ..Below is working fine...thank you so much .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = '-MM-DD'"); //this was the issue earlier //.option("oracle.jdbc.mapDateToTimestamp","false"); .option("lowerBound", "2002-03-31" ); .option("upperBound", "2019-05-01"); .option("dateFormat", "-MM-dd" ); Why should we need .option("dateFormat", "-MM-dd" ); ? Without .option("oracle.jdbc.mapDateToTimestamp","false"); also it is working fine. Still I need to use ? But why it is not working with timestamp? I tried below .option("lowerBound", "2002-03-31 00:00:00" ); .option("upperBound", "2019-05-01 23:59:59"); .option("timestampFormat", "-mm-dd hh:mm:ss"); .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = '-MM-DD HH24:MI:SS'"); Error : java.lang.IllegalArgumentException at java.sql.Date.valueOf(Date.java:143) at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.toInternalBoundValue(JDBCRelation.scala:178) Anyhow thanks a lot > Unable to pull the oracle table data using patitionColumn date/timeStamp > > > Key: SPARK-27723 > URL: https://issues.apache.org/jira/browse/SPARK-27723 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.1 > Environment: Me using spark-sql-2.4.1v , java8 . ojdbc6.jar >Reporter: Shyama >Priority: Major > > > Reproduction steps : > 1. create oracle table > create table schema1.modal_vals( > FAMILY_ID NOT NULL NUMBER, > INSERTION_DATE NOT NULL DATE, > ITEM_VALUE VARCHAR2(4000), > YEAR NUMBER, > QUARTER NUMBER, > LAST_UPDATE_DATE DATE > ) > 3. insert data into oracle table with date column format like "30-JUN-02" > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-17","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","bb",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-OCT-13","a-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-03","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-FEB-03","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JAN-19","aaa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-18","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"01-MAY-19","bb-",2013,2,null); > > > > //please fill the respected oracle details > DataFrameReader ora_df_reader = spark.read().format("jdbc") > .option("url", o_url) > .option("driver", Constants.ORACLE_DRIVER) > .option("user", o_userName) > .option("password", o_passwd) > .option("fetchsize",1000); > > Dataset ss = ora_df_reader > .option("inferSchema", true) > .option("schema","schema1") > .option("numPartitions", 20); > .option("partitionColumn", "INSERTION_DATE"); > .option("lowerBound", "2002-03-31" ) > .option("upperBound", "2019-05-01") > .option("dateFormat", "-MM-dd" )// Tried all "-mm-dd" ,"-MM-dd" > "-MM-DD" "DD-MMM-YY" "dd-MMM-yy" > .option("dbtable", "select * from schema1.modal_vals") > .load(); > > > Error : > {{java.sql.SQLException: ORA-12801: error signaled in parallel query server > P001(2) ORA-01861: literal does not match format string}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SPARK-27723) Unable to pull the oracle table data using patitionColumn date/timeStamp
[ https://issues.apache.org/jira/browse/SPARK-27723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840345#comment-16840345 ] Shyama commented on SPARK-27723: [~yumwang], thanks for reply, you mean to try ".option("oracle.jdbc.mapDateToTimestamp", "false")" ? Yes I already tried .option("partitionColumn","INSERTION_DATE"); .option("oracle.jdbc.mapDateToTimestamp","false"); .option("lowerBound", "2002-03-31" ); .option("upperBound", "2019-05-01"); .option("dateFormat", "-mm-dd" ); , facing same issue ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) java.sql.SQLDataException: ORA-01861: literal does not match format string Let me know if I need to any other thing? > Unable to pull the oracle table data using patitionColumn date/timeStamp > > > Key: SPARK-27723 > URL: https://issues.apache.org/jira/browse/SPARK-27723 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.1 > Environment: Me using spark-sql-2.4.1v , java8 . ojdbc6.jar >Reporter: Shyama >Priority: Major > > > Reproduction steps : > 1. create oracle table > create table schema1.modal_vals( > FAMILY_ID NOT NULL NUMBER, > INSERTION_DATE NOT NULL DATE, > ITEM_VALUE VARCHAR2(4000), > YEAR NUMBER, > QUARTER NUMBER, > LAST_UPDATE_DATE DATE > ) > 3. insert data into oracle table with date column format like "30-JUN-02" > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-17","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","bb",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-02","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-OCT-13","a-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-03","bbb-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","b",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-FEB-03","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-13","aa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JAN-19","aaa+",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"30-JUN-18","ccc-",2013,2,null); > insert into > modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) > values(2,"01-MAY-19","bb-",2013,2,null); > > > > //please fill the respected oracle details > DataFrameReader ora_df_reader = spark.read().format("jdbc") > .option("url", o_url) > .option("driver", Constants.ORACLE_DRIVER) > .option("user", o_userName) > .option("password", o_passwd) > .option("fetchsize",1000); > > Dataset ss = ora_df_reader > .option("inferSchema", true) > .option("schema","schema1") > .option("numPartitions", 20); > .option("partitionColumn", "INSERTION_DATE"); > .option("lowerBound", "2002-03-31" ) > .option("upperBound", "2019-05-01") > .option("dateFormat", "-MM-dd" )// Tried all "-mm-dd" ,"-MM-dd" > "-MM-DD" "DD-MMM-YY" "dd-MMM-yy" > .option("dbtable", "select * from schema1.modal_vals") > .load(); > > > Error : > {{java.sql.SQLException: ORA-12801: error signaled in parallel query server > P001(2) ORA-01861: literal does not match format string}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-27723) Unable to pull the oracle table data using patitionColumn date/timeStamp
Shyama created SPARK-27723: -- Summary: Unable to pull the oracle table data using patitionColumn date/timeStamp Key: SPARK-27723 URL: https://issues.apache.org/jira/browse/SPARK-27723 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.4.1 Environment: Me using spark-sql-2.4.1v , java8 . ojdbc6.jar Reporter: Shyama Fix For: 2.4.0 Reproduction steps : 1. create oracle table create table schema1.modal_vals( FAMILY_ID NOT NULL NUMBER, INSERTION_DATE NOT NULL DATE, ITEM_VALUE VARCHAR2(4000), YEAR NUMBER, QUARTER NUMBER, LAST_UPDATE_DATE DATE ) 3. insert data into oracle table with date column format like "30-JUN-02" insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-02","bbb-",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-13","b+",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-17","bbb-",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-13","bb",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-02","ccc-",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-13","aa-",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-OCT-13","a-",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-03","bbb-",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-13","b",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-FEB-03","aa+",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-13","aa+",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JAN-19","aaa+",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"30-JUN-18","ccc-",2013,2,null); insert into modal_vals(FAMILY_ID,INSERTION_DATE,ITEM_VALUE,YEAR,QUARTER,LAST_UPDATE_DATE) values(2,"01-MAY-19","bb-",2013,2,null); //please fill the respected oracle details DataFrameReader ora_df_reader = spark.read().format("jdbc") .option("url", o_url) .option("driver", Constants.ORACLE_DRIVER) .option("user", o_userName) .option("password", o_passwd) .option("fetchsize",1000); Dataset ss = ora_df_reader .option("inferSchema", true) .option("schema","schema1") .option("numPartitions", 20); .option("partitionColumn", "INSERTION_DATE"); .option("lowerBound", "2002-03-31" ) .option("upperBound", "2019-05-01") .option("dateFormat", "-MM-dd" )// Tried all "-mm-dd" ,"-MM-dd" "-MM-DD" "DD-MMM-YY" "dd-MMM-yy" .option("dbtable", "select * from schema1.modal_vals") .load(); Error : {{java.sql.SQLException: ORA-12801: error signaled in parallel query server P001(2) ORA-01861: literal does not match format string}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20166) Use XXX for ISO timezone instead of ZZ which is FastDateFormat specific in CSV/JSON time related options
[ https://issues.apache.org/jira/browse/SPARK-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1683#comment-1683 ] Shyama commented on SPARK-20166: [~srowen], any clue of this ? [https://stackoverflow.com/questions/56020103/how-to-pass-date-timestamp-as-lowerbound-upperbound-in-spark-sql-2-4-1v] > Use XXX for ISO timezone instead of ZZ which is FastDateFormat specific in > CSV/JSON time related options > > > Key: SPARK-20166 > URL: https://issues.apache.org/jira/browse/SPARK-20166 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.2.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Minor > Fix For: 2.2.0 > > > We can use {{XXX}} format instead of {{ZZ}}. {{ZZ}} seems a > {{FastDateFormat}} specific Please see > https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#iso8601timezone > and > https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/FastDateFormat.html > {{ZZ}} supports "ISO 8601 extended format time zones" but it seems > {{FastDateFormat}} specific option. > It seems we better replace {{ZZ}} to {{XXX}} because they look use the same > strategy - > https://github.com/apache/commons-lang/blob/8767cd4f1a6af07093c1e6c422dae8e574be7e5e/src/main/java/org/apache/commons/lang3/time/FastDateParser.java#L930. > > I also checked the codes and manually debugged it for sure. It seems both > cases use the same pattern {code}( Z|(?:[+-]\\d{2}(?::)\\d{2})) {code}. > Note that this is a fix about documentation not the behaviour change because > {{ZZ}} seems invalid date format in {{SimpleDateFormat}} as documented in > {{DataFrameReader}}: > {quote} >* `timestampFormat` (default `-MM-dd'T'HH:mm:ss.SSSZZ`): sets the > string that >* indicates a timestamp format. Custom date formats follow the formats at >* `java.text.SimpleDateFormat`. This applies to timestamp type. > {quote} > {code} > scala> new > java.text.SimpleDateFormat("-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00") > res4: java.util.Date = Tue Mar 21 20:00:00 KST 2017 > scala> new > java.text.SimpleDateFormat("-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z") > res10: java.util.Date = Tue Mar 21 09:00:00 KST 2017 > scala> new > java.text.SimpleDateFormat("-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00") > java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000-11:00" > at java.text.DateFormat.parse(DateFormat.java:366) > ... 48 elided > scala> new > java.text.SimpleDateFormat("-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z") > java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000Z" > at java.text.DateFormat.parse(DateFormat.java:366) > ... 48 elided > {code} > {code} > scala> > org.apache.commons.lang3.time.FastDateFormat.getInstance("-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00") > res7: java.util.Date = Tue Mar 21 20:00:00 KST 2017 > scala> > org.apache.commons.lang3.time.FastDateFormat.getInstance("-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z") > res1: java.util.Date = Tue Mar 21 09:00:00 KST 2017 > scala> > org.apache.commons.lang3.time.FastDateFormat.getInstance("-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00") > res8: java.util.Date = Tue Mar 21 20:00:00 KST 2017 > scala> > org.apache.commons.lang3.time.FastDateFormat.getInstance("-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z") > res2: java.util.Date = Tue Mar 21 09:00:00 KST 2017 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22814) JDBC support date/timestamp type as partitionColumn
[ https://issues.apache.org/jira/browse/SPARK-22814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832420#comment-16832420 ] Shyama commented on SPARK-22814: I am trying the same thing but getting error , so how does partitionColumn infer the DataType it ? [https://stackoverflow.com/questions/55965978/how-to-set-jdbc-partitioncolumn-type-to-date-in-spark-2-4-1?noredirect=1#comment98586394_55965978] > JDBC support date/timestamp type as partitionColumn > --- > > Key: SPARK-22814 > URL: https://issues.apache.org/jira/browse/SPARK-22814 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 1.6.2, 2.2.1 >Reporter: Yuechen Chen >Assignee: Takeshi Yamamuro >Priority: Major > Fix For: 2.4.0 > > Original Estimate: 168h > Remaining Estimate: 168h > > In spark, you can partition MySQL queries by partitionColumn. > val df = (spark.read.jdbc(url=jdbcUrl, > table="employees", > columnName="emp_no", > lowerBound=1L, > upperBound=10L, > numPartitions=100, > connectionProperties=connectionProperties)) > display(df) > But, partitionColumn must be a numeric column from the table. > However, there are lots of table, which has no primary key, and has some > date/timestamp indexes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org