[ 
https://issues.apache.org/jira/browse/BEAM-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chie hayashida updated BEAM-10934:
----------------------------------
    Description: 
When I convert an HCatRecord that contains a Date-type field to a Row, the 
conversion fails with the following error.

* the code
```
    PCollection<Row> p =
        pipeline
            /*
             * Step #1: Read rows from the Hive table.
             */
            .apply(
                "Read from Hive source",
                HCatToRow.fromSpec(
                    HCatalogIO.read()
                        .withConfigProperties(configProperties)
                        .withDatabase(options.getHiveDatabaseName())
                        .withTable(options.getHiveTableName())
                        .withFilter(options.getFilterString())));
```

* error log
```
org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
        at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348)
        at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318)
        at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213)
        at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
        at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
        at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
        at com.google.cloud.teleport.v2.templates.HiveToBigQuery.run(HiveToBigQuery.java:234)
        at com.google.cloud.teleport.v2.templates.HiveToBigQuery.main(HiveToBigQuery.java:176)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
        at org.apache.beam.sdk.values.Row$Builder.verifyDateTime(Row.java:828)
        at org.apache.beam.sdk.values.Row$Builder.verifyPrimitiveType(Row.java:755)
        at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:654)
        at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:635)
        at org.apache.beam.sdk.values.Row$Builder.build(Row.java:840)
        at org.apache.beam.sdk.io.hcatalog.HCatToRow$HCatToRowFn.processElement(HCatToRow.java:84)
```

This happens because HCatalogIO reads Date-type columns as java.sql.Date values 
in HCatRecord, but the Row class's DATETIME field type does not accept 
java.sql.Date, and HCatToRow does not convert the value before building the Row.

I see two possible solutions.

1. Make the Row type support Date values (java.util.Date or java.sql.Date).
   I don't know the other IO classes well, but some of them may have the same 
   problem, and this change could fix those cases too.

2. Add logic to HCatToRow that converts Date values to the DATETIME type.
   The impact of this change would be smaller than that of option 1 because it 
   doesn't change the Row class.

Which would be better?
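
To make option 2 a little more concrete, here is a minimal sketch of the conversion step, under some assumptions. `SqlDateNormalizer` and `normalize` are hypothetical names that do not exist in Beam, and the sketch uses only the JDK so that it runs on its own. It returns a `java.time.Instant`; inside HCatToRow the value would presumably have to be wrapped in a Joda-Time instant instead, since that appears to be what the DATETIME field type checks for.

```java
import java.time.ZoneOffset;

/** Hypothetical helper (not part of Beam): normalizes Date values before a Row is built. */
final class SqlDateNormalizer {
  private SqlDateNormalizer() {}

  /**
   * Returns a UTC-midnight java.time.Instant for java.sql.Date values and
   * passes every other value through unchanged.
   */
  static Object normalize(Object value) {
    if (value instanceof java.sql.Date) {
      java.sql.Date date = (java.sql.Date) value;
      // toLocalDate() keeps only year/month/day, so the result does not
      // depend on the default time zone of the JVM running the pipeline.
      return date.toLocalDate().atStartOfDay(ZoneOffset.UTC).toInstant();
      // Assumption: in HCatToRow this would become something like
      // new org.joda.time.Instant(instant.toEpochMilli()).
    }
    return value;
  }
}
```

Under this approach each value of an HCatRecord would be passed through `normalize` before being handed to the Row builder. Pinning the date to midnight UTC is a design choice of the sketch, not something the issue prescribes.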


> handling Date type when convert another class to Row class
> ----------------------------------------------------------
>
>                 Key: BEAM-10934
>                 URL: https://issues.apache.org/jira/browse/BEAM-10934
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-hcatalog, sdk-java-core
>            Reporter: chie hayashida
>            Priority: P2



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
