[
https://issues.apache.org/jira/browse/BEAM-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anton Kedin updated BEAM-3574:
------------------------------
Description:
Currently there are utility methods in BeamRecord to get field values by name,
e.g. BeamRecord.getFieldValue(String name). Internally they call
fieldNamesArrayList.indexOf(fieldName) to find the index of the field name.
This works as long as there is only one field with that name in the record. But
when joining 2 records you can end up with duplicate field names, without any
means of distinguishing them or getting a value from a specific field by name.
We don't keep any metadata in BeamRecordType to help identify a field in this
case.
It feels that this can lead to obscure bugs.
We probably should keep more detailed schema information attached to the
fields, so that we could reference them using qualifiers like
"[schemaA].[pcollectionB].[fieldC]".
was:
Currently there are utility methods in BeamRecord to get field values by name,
e.g. BeamRecord.getFieldValue(String name). Internally they call
fieldNamesArrayList.indexOf(fieldName) to find the index of the field name.
This works as long as there is only one field with such name in the record. But
when joining 2 records you can end up with duplicate fields without any means
of distinguishing them and getting a value from specific field by name. We
don't keep any metadata in BeamRecordType to help identify a field in this
case.
It feels that this can lead to obscure bugs.
We probably should keep more detailed schema information attached to the
fields, so that we could reference them using qualifiers like
"[schemaA].[pcollectionB].[fieldC]".
> [SQL] Support schema qualifiers for field names
> -----------------------------------------------
>
> Key: BEAM-3574
> URL: https://issues.apache.org/jira/browse/BEAM-3574
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Reporter: Anton Kedin
> Priority: Major
>
> Currently there are utility methods in BeamRecord to get field values by
> name, e.g. BeamRecord.getFieldValue(String name). Internally they call
> fieldNamesArrayList.indexOf(fieldName) to find the index of the field name.
> This works as long as there is only one field with that name in the record.
> But when joining 2 records you can end up with duplicate field names, without
> any means of distinguishing them or getting a value from a specific field by
> name. We don't keep any metadata in BeamRecordType to help identify a field
> in this case.
> It feels that this can lead to obscure bugs.
> We probably should keep more detailed schema information attached to the
> fields, so that we could reference them using qualifiers like
> "[schemaA].[pcollectionB].[fieldC]".
>
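
To illustrate the problem described above, here is a minimal standalone sketch in
plain Java. It does not use the actual BeamRecord or BeamRecordType classes; the
class QualifiedFieldLookup, its methods, and the "orders"/"customers" qualifiers
are hypothetical names chosen for the example. It shows that an indexOf-based
lookup always resolves a duplicated name to the first matching field, and how
keeping a qualifier next to each field name could remove the ambiguity.

```java
import java.util.Arrays;
import java.util.List;

public class QualifiedFieldLookup {

    // Name-only lookup, modelled on the indexOf-based approach in the description.
    // List.indexOf returns the FIRST matching index, so a duplicated name
    // can never reach the second field.
    static Object getByName(List<String> fieldNames, List<Object> values, String name) {
        int index = fieldNames.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("No such field: " + name);
        }
        return values.get(index);
    }

    // Hypothetical qualified lookup: each field carries a qualifier (for example
    // the source collection) in addition to its name, and both must match.
    static Object getByQualifiedName(List<String> qualifiers, List<String> fieldNames,
                                     List<Object> values, String qualifier, String name) {
        for (int i = 0; i < fieldNames.size(); i++) {
            if (qualifiers.get(i).equals(qualifier) && fieldNames.get(i).equals(name)) {
                return values.get(i);
            }
        }
        throw new IllegalArgumentException("No such field: " + qualifier + "." + name);
    }

    public static void main(String[] args) {
        // Result of joining orders(id, amount) with customers(id, name):
        // the joined record has two fields called "id".
        List<String> qualifiers = Arrays.asList("orders", "orders", "customers", "customers");
        List<String> fieldNames = Arrays.asList("id", "amount", "id", "name");
        List<Object> values = Arrays.<Object>asList(101, 25.0, 7, "Alice");

        // Name-only lookup silently returns the orders id.
        System.out.println(getByName(fieldNames, values, "id"));

        // Qualified lookup can reach the customers id.
        System.out.println(getByQualifiedName(qualifiers, fieldNames, values, "customers", "id"));
    }
}
```

In this sketch the name-only call returns the value of the first "id" field, and
the value of the second "id" field is only reachable through the qualified
call. How qualifiers would actually be stored in BeamRecordType is left open.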