[ https://issues.apache.org/jira/browse/PHOENIX-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13929826#comment-13929826 ]
James Taylor commented on PHOENIX-66:
-------------------------------------

Patch looks good, [~gabriel.reid]. Thanks so much. A couple of minor items:

- Can you use the standard JDBC APIs for instantiating the array instead of the internal PArrayDataType methods? See ArrayTest for examples.
- Do you have an error check for the array delimiter being the same as the field delimiter? That would cause issues, no?
- Can you use our tab/spacing conventions and compiler settings (see Eclipse prefs in the phoenix/dev dir)?
- We've gotten away from normalizing column names automatically, as it causes problems if folks use case-sensitive names. Would you mind updating that?
- Do all tests pass?
- Can the patch be applied to 3.0, 4.0, and master (as we have three branches now)? If not, would you mind attaching separate patches for the different branches?

[~jeffreyz] - are you ok with this change? [~mujtaba] and [~james.viole...@ds-iq.com] - any feedback on this?

> Support array creation from CSV file
> ------------------------------------
>
> Key: PHOENIX-66
> URL: https://issues.apache.org/jira/browse/PHOENIX-66
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Fix For: 3.0.0
>
> Attachments: PHOENIX-66-intermediate.patch, PHOENIX-66.patch
>
>
> We should support being able to parse an array defined in our CSV file. Perhaps something like this:
>
> a, b, c, [foo, 1, bar], d
>
> We'd know (from the data type of the column) that we have an array for the fourth field here.
>
> One option to support this would be to implement PDataType.toObject(String) for the ARRAY PDataType enums. That's not ideal, though, as we'd introduce a dependency from PDataType to our CSVLoader, since we'd need to in turn parse each element. Also, we don't have a way to pass through the custom delimiters that might be in use.
> Another pretty trivial, though a bit more constrained, approach would be to look at the column ARRAY_SIZE to control how many of the next CSV columns should be used as array elements. In this approach, you wouldn't use the square brackets at all. You can get the ARRAY_SIZE from the column metadata returned by a connection.getMetaData().getColumns() call, via resultSet.getInt("ARRAY_SIZE"). However, the ARRAY_SIZE is optional in a DDL statement, so we'd need to do something for the case where it's not specified.
>
> A third option would be to handle most of the parsing in the CSVLoader. We could use the above bracket syntax, and then collect up the next set of CSV field elements until we hit the unescaped ']'. Then we'd use our standard JDBC APIs to build the array and continue on our merry way.
>
> What do you think, [~jviolettedsiq]? Or [~bruno], maybe you can take a crack at it?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
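For readers following along, the third option above can be sketched in a few lines of Java. This is only an illustrative mock-up, not Phoenix code: the class and method names are invented, and it assumes the CSV line has already been split into fields by the usual delimiter. It collects fields from an opening `[` until a field ending in an unescaped `]`, which is the point where the standard JDBC `Connection.createArrayOf` call would take over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "collect until unescaped ']'" approach.
// CsvArrayCollector and collectArrayElements are illustrative names,
// not part of the Phoenix CSVLoader.
public class CsvArrayCollector {

    /**
     * Collects array elements starting at fields.get(start), which is
     * assumed to begin with '['. Stops at the first field that ends with
     * an unescaped ']' and returns the elements with brackets stripped.
     */
    static List<String> collectArrayElements(List<String> fields, int start) {
        List<String> elements = new ArrayList<>();
        for (int i = start; i < fields.size(); i++) {
            String f = fields.get(i).trim();
            if (i == start) {
                f = f.substring(1); // drop the opening '['
            }
            // A trailing "\]" is treated as an escaped bracket, not a close.
            if (f.endsWith("]") && !f.endsWith("\\]")) {
                elements.add(f.substring(0, f.length() - 1).trim());
                return elements;
            }
            elements.add(f);
        }
        throw new IllegalArgumentException("Unterminated array literal");
    }

    public static void main(String[] args) {
        // The example record from the issue description:
        List<String> fields = Arrays.asList("a", "b", "c", "[foo", " 1", " bar]", "d");
        List<String> arr = collectArrayElements(fields, 3);
        System.out.println(arr); // prints [foo, 1, bar]

        // With a live connection, the elements would then be bound through
        // the standard JDBC API, e.g.:
        //   java.sql.Array sqlArray = conn.createArrayOf("VARCHAR", arr.toArray());
        //   stmt.setArray(4, sqlArray);
    }
}
```

The design point is the one the comment makes: keeping the bracket parsing in the CSV loader means PDataType stays ignorant of CSV concerns, and custom delimiters never have to be threaded through the type system.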