[jira] [Commented] (BEAM-4417) BigqueryIO Numeric datatype Support
[ https://issues.apache.org/jira/browse/BEAM-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517885#comment-16517885 ]

Kishan Kumar commented on BEAM-4417:
------------------------------------

Is there any temporary workaround available, or is any work in progress on this?

> BigqueryIO Numeric datatype Support
> -----------------------------------
>
> Key: BEAM-4417
> URL: https://issues.apache.org/jira/browse/BEAM-4417
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Affects Versions: 2.4.0
> Reporter: Kishan Kumar
> Assignee: Chamikara Jayalath
> Priority: Critical
> Labels: newbie, patch
> Fix For: 2.6.0
>
> BigQueryIO.read fails while parsing the Avro file that is generated when
> reading from a table that has columns of the *Numeric* data type.
> We went through the source code on GitHub and noticed that the *Numeric
> data type is not yet supported.*
>
> Caused by: com.google.common.base.VerifyException: Unsupported BigQuery type: NUMERIC

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4417) BigqueryIO Numeric datatype Support
[ https://issues.apache.org/jira/browse/BEAM-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kishan Kumar updated BEAM-4417:
------------------------------

Description:
BigQueryIO.read fails while parsing the Avro file that is generated when reading from a table that has columns of the *Numeric* data type.
We went through the source code on GitHub and noticed that the *Numeric data type is not yet supported.*

Caused by: com.google.common.base.VerifyException: Unsupported BigQuery type: NUMERIC

was:
BigQueryIO.read fails while parsing the Avro file that is generated when reading from a table that has columns of the *Numeric* data type.
We went through the source code on GitHub and noticed that the *Numeric data type is not yet supported.*

> BigqueryIO Numeric datatype Support
> -----------------------------------
>
> Key: BEAM-4417
> URL: https://issues.apache.org/jira/browse/BEAM-4417
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Affects Versions: 2.4.0
> Reporter: Kishan Kumar
> Assignee: Chamikara Jayalath
> Priority: Critical
> Labels: newbie, patch
> Fix For: 2.6.0
[jira] [Created] (BEAM-4417) BigqueryIO Numeric datatype Support
Kishan Kumar created BEAM-4417:
-------------------------------

Summary: BigqueryIO Numeric datatype Support
Key: BEAM-4417
URL: https://issues.apache.org/jira/browse/BEAM-4417
Project: Beam
Issue Type: Improvement
Components: io-java-gcp
Affects Versions: 2.4.0
Reporter: Kishan Kumar
Assignee: Chamikara Jayalath
Fix For: 2.5.0

BigQueryIO.read fails while parsing the Avro file that is generated when reading from a table that has columns of the *Numeric* data type.
We went through the source code on GitHub and noticed that the *Numeric data type is not yet supported.*
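For illustration only, here is a minimal sketch of how a NUMERIC value might be decoded by hand once the raw Avro field is available. It assumes that BigQuery exports NUMERIC to Avro as `bytes` carrying the `decimal` logical type, that is, a big-endian two's-complement unscaled integer, with a fixed scale of 9. The class and method names (`NumericDecoder`, `decode`) are hypothetical; they are not part of BigQueryIO, and this is not necessarily how the issue was resolved in Beam.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of Beam or BigQueryIO.
class NumericDecoder {
    // Assumption: BigQuery NUMERIC always uses 9 digits after the decimal point.
    static final int NUMERIC_SCALE = 9;

    // Interprets the bytes as a big-endian two's-complement unscaled integer
    // and applies the assumed scale.
    static BigDecimal decode(ByteBuffer avroBytes) {
        byte[] raw = new byte[avroBytes.remaining()];
        // duplicate() leaves the caller's buffer position untouched.
        avroBytes.duplicate().get(raw);
        return new BigDecimal(new BigInteger(raw), NUMERIC_SCALE);
    }

    public static void main(String[] args) {
        // 12.5 stored as the unscaled integer 12500000000 (12.5 * 10^9).
        byte[] encoded = new BigInteger("12500000000").toByteArray();
        System.out.println(decode(ByteBuffer.wrap(encoded)));
    }
}
```

A helper like this would only be a stopgap in user code that reads the Avro records directly; it does not remove the `VerifyException` raised inside BigQueryIO itself.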
[jira] [Commented] (BEAM-3647) Default Coder/Reading Coder From File
[ https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424999#comment-16424999 ]

Kishan Kumar commented on BEAM-3647:
------------------------------------

Any updates?

> Default Coder/Reading Coder From File
> -------------------------------------
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Assignee: Anton Kedin
> Priority: Major
>
> *Requirement*: We need to run one template with the same logic on data from
> different tables (an example is given below).
>
> *Need*: A default coder is required that, depending on the data, treats all
> fields as strings and reads the data. Otherwise, there must be a dynamic
> option to read the coder from GCS as a JSON file (whose location we can pass
> using a ValueProvider), or from somewhere else, and parse the data on that
> basis at run time.
>
> *Examples*: I have two tables. Table 1 has the columns (NAME, CLASS, ROLL,
> SUB_PRICE) and table 2 has the columns (NAME, ROLL, SUB, TEST_MARKS).
>
> On both tables I am just sorting by roll number, so if we could read the
> coder at run time, the same template could be used for different tables.
>
> This would make our work and our jobs easier.
[jira] [Commented] (BEAM-3647) Default Coder/Reading Coder From File
[ https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366591#comment-16366591 ]

Kishan Kumar commented on BEAM-3647:
------------------------------------

Thanks [~jkff]. It was also my mistake that I was not clear about my need.

> Default Coder/Reading Coder From File
> -------------------------------------
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Assignee: Anton Kedin
> Priority: Major
[jira] [Comment Edited] (BEAM-3647) Default Coder/Reading Coder From File
[ https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365147#comment-16365147 ]

Kishan Kumar edited comment on BEAM-3647 at 2/15/18 4:55 AM:
-------------------------------------------------------------

Thanks [~jkff]. To be specific, in

    PCollection inputTable = PBegin.in(p).apply(Create.of(row1, row2, row3).*withCoder(type.getRowCoder())*);

can we choose the coder at run time? The same query is going to run on different DDLs, as shown in the example above, and looking at the *case* above I have not found such a use case.

I have also asked the same question on Stack Overflow: https://stackoverflow.com/questions/47806368/running-beamsql-withoutcoder-or-making-coder-dynamic

was (Author: kishank):
[~jkff] To be specific, in

    PCollection inputTable = PBegin.in(p).apply(Create.of(row1, row2, row3).*withCoder(type.getRowCoder())*);

can we choose the coder at run time? The same query is going to run on different DDLs, as shown in the example above, and looking at the *case* above I have not found such a use case.

> Default Coder/Reading Coder From File
> -------------------------------------
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Priority: Major
> Fix For: 2.1.0
[jira] [Commented] (BEAM-3647) Default Coder/Reading Coder From File
[ https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365147#comment-16365147 ]

Kishan Kumar commented on BEAM-3647:
------------------------------------

[~jkff] To be specific, in

    PCollection inputTable = PBegin.in(p).apply(Create.of(row1, row2, row3).*withCoder(type.getRowCoder())*);

can we choose the coder at run time? The same query is going to run on different DDLs, as shown in the example above, and looking at the *case* above I have not found such a use case.

> Default Coder/Reading Coder From File
> -------------------------------------
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Priority: Major
> Fix For: 2.1.0
[jira] [Comment Edited] (BEAM-3647) Default Coder/Reading Coder From File
[ https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361880#comment-16361880 ]

Kishan Kumar edited comment on BEAM-3647 at 2/13/18 6:37 AM:
-------------------------------------------------------------

Thanks, [~kenn] and [~kedin]. I want to point out that for the different use cases we are currently using two different templates, only because we need to define the coder there. If we could read the coder at run time, both jobs could be done in a single template, because in both cases the SQL selection is done on the roll number only.

was (Author: kishank):
Thanks, [~kenn] and [~kedin]

> Default Coder/Reading Coder From File
> -------------------------------------
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Priority: Major
[jira] [Commented] (BEAM-3647) Default Coder/Reading Coder From File
[ https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361880#comment-16361880 ]

Kishan Kumar commented on BEAM-3647:
------------------------------------

Thanks, [~kenn] and [~kedin]

> Default Coder/Reading Coder From File
> -------------------------------------
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Priority: Major
[jira] [Updated] (BEAM-3647) Default Coder/Reading Coder From File
[ https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kishan Kumar updated BEAM-3647:
------------------------------

Description:
*Requirement*: We need to run one template with the same logic on data from different tables (an example is given below).

*Need*: A default coder is required that, depending on the data, treats all fields as strings and reads the data. Otherwise, there must be a dynamic option to read the coder from GCS as a JSON file (whose location we can pass using a ValueProvider), or from somewhere else, and parse the data on that basis at run time.

*Examples*: I have two tables. Table 1 has the columns (NAME, CLASS, ROLL, SUB_PRICE) and table 2 has the columns (NAME, ROLL, SUB, TEST_MARKS).

On both tables I am just sorting by roll number, so if we could read the coder at run time, the same template could be used for different tables.

This would make our work and our jobs easier.

was:
*Requirement*: We need to run one template with the same logic on data from different tables (an example is given below).

*Need*: A default coder is required that, depending on the data, treats all fields as strings and reads the data. Otherwise, there must be a dynamic option to read the coder from GCS or somewhere else, so that at run time, using a ValueProvider or something else, we can change the coder for data whose queries are the same for the common fields but whose data is different.

*Examples*: I have two tables. Table 1 has the columns (NAME, CLASS, ROLL, SUB_PRICE) and table 2 has the columns (NAME, ROLL, SUB, TEST_MARKS).

On both tables I am just sorting by roll number, so if we could read the coder at run time, the same template could be used for different tables.

This would make our work and our jobs easier.

> Default Coder/Reading Coder From File
> -------------------------------------
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Assignee: Kenneth Knowles
> Priority: Critical
> Labels: newbie
[jira] [Created] (BEAM-3647) Default Coder/Reading Coder From File
Kishan Kumar created BEAM-3647:
-------------------------------

Summary: Default Coder/Reading Coder From File
Key: BEAM-3647
URL: https://issues.apache.org/jira/browse/BEAM-3647
Project: Beam
Issue Type: New Feature
Components: beam-model, dsl-sql
Affects Versions: 2.2.0
Reporter: Kishan Kumar
Assignee: Kenneth Knowles

*Requirement*: We need to run one template with the same logic on data from different tables (an example is given below).

*Need*: A default coder is required that, depending on the data, treats all fields as strings and reads the data. Otherwise, there must be a dynamic option to read the coder from GCS or somewhere else, so that at run time, using a ValueProvider or something else, we can change the coder for data whose queries are the same for the common fields but whose data is different.

*Examples*: I have two tables. Table 1 has the columns (NAME, CLASS, ROLL, SUB_PRICE) and table 2 has the columns (NAME, ROLL, SUB, TEST_MARKS).

On both tables I am just sorting by roll number, so if we could read the coder at run time, the same template could be used for different tables.

This would make our work and our jobs easier.
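The "treat every field as a string" idea from the request can be sketched in plain Java, without any Beam classes. The sketch assumes comma-separated input lines and a column list that would be supplied at run time (hard-coded here); the class name `StringRowSketch`, its methods, and the sample rows are all made up for illustration and do not correspond to any Beam API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration in plain Java; no Beam classes are used.
class StringRowSketch {
    // Pairs each comma-separated value with the column name at the same position.
    // Every field is kept as a String, mirroring the "default coder" idea.
    static Map<String, String> parse(List<String> columns, String line) {
        String[] values = line.split(",", -1);
        if (values.length != columns.size()) {
            throw new IllegalArgumentException(
                "Expected " + columns.size() + " fields but got: " + line);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            row.put(columns.get(i), values[i].trim());
        }
        return row;
    }

    // Sorts rows by the ROLL column, parsed as an integer only for the comparison.
    static List<Map<String, String>> sortByRoll(List<Map<String, String>> rows) {
        List<Map<String, String>> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt(r -> Integer.parseInt(r.get("ROLL"))));
        return sorted;
    }

    public static void main(String[] args) {
        // In the requested feature the column list would arrive at run time.
        List<String> table1 = Arrays.asList("NAME", "CLASS", "ROLL", "SUB_PRICE");
        // Made-up sample rows.
        List<Map<String, String>> rows = Arrays.asList(
            parse(table1, "Asha,10,7,120.50"),
            parse(table1, "Ravi,10,3,99.00"));
        System.out.println(sortByRoll(rows));
    }
}
```

The same two methods work unchanged for the second table, (NAME, ROLL, SUB, TEST_MARKS), because only the ROLL column is interpreted; that is the property the request is asking the template to have.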
[jira] [Updated] (BEAM-3615) Dynamic/Default Coder For Data
[ https://issues.apache.org/jira/browse/BEAM-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kishan Kumar updated BEAM-3615:
------------------------------

Description:
We need to define the coder statically in every job when using Beam SQL. In the previous version, 2.1.0, we were reading Avro coder files statically, but in 2.2.0 we can now read them dynamically. A similar feature needs to be introduced so that we can define the coder dynamically at run time, reading it from GCS or from a table (the table schema), so that the transformation can be performed. This would make Dataflow templates more efficient: a single template could be used multiple times for one type of job, whereas now we need to make multiple templates for different jobs.

was:
We need to define the coder statically in every job when using Beam SQL, as in the previous version, 2.1.0, where we were reading Avro coder files statically. The coder needs to be defined dynamically at run time, reading from GCS or a table, so that the transformation can be performed.

> Dynamic/Default Coder For Data
> ------------------------------
>
> Key: BEAM-3615
> URL: https://issues.apache.org/jira/browse/BEAM-3615
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql, sdk-java-extensions
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Assignee: Xu Mingmin
> Priority: Critical
> Labels: newbie, performance
[jira] [Created] (BEAM-3615) Dynamic/Default Coder For Data
Kishan Kumar created BEAM-3615:
-------------------------------

Summary: Dynamic/Default Coder For Data
Key: BEAM-3615
URL: https://issues.apache.org/jira/browse/BEAM-3615
Project: Beam
Issue Type: New Feature
Components: dsl-sql, sdk-java-extensions
Affects Versions: 2.2.0
Reporter: Kishan Kumar
Assignee: Xu Mingmin

We need to define the coder statically in every job when using Beam SQL, as in the previous version, 2.1.0, where we were reading Avro coder files statically. The coder needs to be defined dynamically at run time, reading from GCS or a table, so that the transformation can be performed.
[jira] [Updated] (BEAM-3509) PARTITION BY in Beam SQL In Select Command
[ https://issues.apache.org/jira/browse/BEAM-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kishan Kumar updated BEAM-3509:
------------------------------

Fix Version/s: 2.3.0

> PARTITION BY in Beam SQL In Select Command
> ------------------------------------------
>
> Key: BEAM-3509
> URL: https://issues.apache.org/jira/browse/BEAM-3509
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Affects Versions: 2.2.0
> Reporter: Kishan Kumar
> Assignee: Xu Mingmin
> Priority: Major
> Labels: performance
> Fix For: 2.3.0
>
> A PARTITION BY option would be very helpful for Dataflow developers who
> migrate queries and run transformations on them, because many *Netezza and
> Oracle queries* contain PARTITION BY, which makes SQL queries more efficient.
> *The alternative is to use joins and filtering. It can be done, but it makes
> the code unreadable and the performance of the Dataflow job poor.*
> Example: SELECT MIN(COLUMN) OVER (PARTITION BY COLUMN NAME) FROM TABLENAME
[jira] [Created] (BEAM-3509) PARTITION BY in Beam SQL In Select Command
Kishan Kumar created BEAM-3509:
-------------------------------

Summary: PARTITION BY in Beam SQL In Select Command
Key: BEAM-3509
URL: https://issues.apache.org/jira/browse/BEAM-3509
Project: Beam
Issue Type: New Feature
Components: dsl-sql
Affects Versions: 2.2.0
Reporter: Kishan Kumar
Assignee: Xu Mingmin

A PARTITION BY option would be very helpful for Dataflow developers who migrate queries and run transformations on them, because many *Netezza and Oracle queries* contain PARTITION BY, which makes SQL queries more efficient. *The alternative is to use joins and filtering. It can be done, but it makes the code unreadable and the performance of the Dataflow job poor.*

Example: SELECT MIN(COLUMN) OVER (PARTITION BY COLUMN NAME) FROM TABLENAME
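To make the requested semantics concrete, here is a small plain-Java sketch of what `MIN(value) OVER (PARTITION BY key)` computes: every input row is kept, and each row is paired with the minimum of its own partition. This only illustrates the SQL behaviour; it is not Beam SQL and says nothing about how Beam would implement it. The class name `PartitionMinSketch` and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration in plain Java; this is not Beam SQL.
class PartitionMinSketch {
    // For each input row, in input order, returns the minimum value found
    // among all rows that share that row's partition key.
    static List<Integer> minOverPartition(List<String> keys, List<Integer> values) {
        if (keys.size() != values.size()) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        // First pass: one minimum per partition key.
        Map<String, Integer> minByKey = new HashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            minByKey.merge(keys.get(i), values.get(i), Integer::min);
        }
        // Second pass: attach the partition minimum to every row.
        List<Integer> result = new ArrayList<>();
        for (String key : keys) {
            result.add(minByKey.get(key));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "a");
        List<Integer> values = Arrays.asList(5, 2, 3);
        System.out.println(minOverPartition(keys, values)); // prints [3, 2, 3]
    }
}
```

The two passes correspond to the join-based alternative mentioned in the description: an aggregate per key followed by a join back to the original rows, which a single `OVER (PARTITION BY ...)` clause would express in one step.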