[ 
https://issues.apache.org/jira/browse/PHOENIX-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723581#comment-16723581
 ] 

Chinmay Kulkarni commented on PHOENIX-374:
------------------------------------------

[~tdsilva] A couple of points:
 * When upserting data for dynamic columns, we will need to embed the data type 
of the dynamic column. We can achieve this using a scan attribute which stores 
info about dynamic columns and can be resolved on the server-side in a 
RegionObserver coprocessor in the _doPostScannerOpen_ method.
 * When selecting wildcards or CF wildcards, we currently set the column family 
of the scan; however, that also sets the "columns" that can be iterated over in 
the scan. Based on a config (defaulting to false), we can either set or not set 
column families for the scanner in the case of wildcard queries. On top of 
this, we would need to stop projecting columns in the ResultSet and add APIs to 
PhoenixResultSet to figure out: the number of dynamic columns, the data type of 
each dynamic column, and getValue for a dynamic column (this last one is 
basically a combination of getting the data type and then coercing the 
ResultSet value to that type, as getInt, getBoolean, etc. do).
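
To make the two points above concrete, here is a rough sketch, not a proposed implementation. It assumes the dynamic column metadata is a simple "column name -> SQL type name" map packed into a byte[], which is the value type that HBase's Scan.setAttribute(String, byte[]) accepts. The class DynamicColumnAttr, its methods, and the wire layout are all hypothetical. The getValue coercion uses plain big-endian and UTF-8 decoding for illustration only, not Phoenix's actual type serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Phoenix): packs dynamic column metadata
// into a byte[] suitable for use as a scan attribute value, and unpacks it.
final class DynamicColumnAttr {

    // Layout (an assumption): entry count, then (name, type) pairs as UTF strings.
    static byte[] encode(Map<String, String> columnTypes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(columnTypes.size());
            for (Map.Entry<String, String> e : columnTypes.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // What the server side would do after reading the attribute from the scan.
    static Map<String, String> decode(byte[] attr) {
        Map<String, String> columnTypes = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(attr))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                String type = in.readUTF();
                columnTypes.put(name, type);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return columnTypes;
    }

    // "getValue" idea: look up the declared type, then coerce the raw bytes.
    // Simplified decoding for illustration; Phoenix's own encodings differ.
    static Object getValue(String typeName, byte[] raw) {
        switch (typeName) {
            case "INTEGER":
                return ByteBuffer.wrap(raw).getInt();
            case "BOOLEAN":
                return raw[0] != 0;
            case "VARCHAR":
                return new String(raw, StandardCharsets.UTF_8);
            default:
                throw new IllegalArgumentException("Unsupported type: " + typeName);
        }
    }
}
```

On the client, the result of encode(...) would be set on the scan under some agreed attribute name, and the coprocessor would call decode(...) on the same attribute. The number of dynamic columns is then just the size of the decoded map.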

Let me know what you think.

> Enable access to dynamic columns in * or cf.* selection
> -------------------------------------------------------
>
>                 Key: PHOENIX-374
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-374
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: nicolas maillard
>            Assignee: Chinmay Kulkarni
>            Priority: Critical
>
> As of recent work, we can now read and write columns that are not in the 
> schema, AKA dynamic columns. The Select and Upsert statements allow dynamic 
> columns to be specified. 
> I think two additions are still needed.
> - Alter dynamically: in the Upsert and/or Select statement, the ability to 
> add the specified dynamic column to the schema. Say Upsert into Table (key, 
> cf.dynColumn varchar SCHEMAADD) values (..)
> and for select: 
>      - select key, cf.dynColumn varchar from T would only read
>      - select key from T(cf.dynColumn varchar ) would only read and write to 
> schema
> - Select a complete column family: more complex, accessing a whole column 
> family with all columns, known in the schema or not.
>  select cf.* from T
> Today this works for known columns; it would be nice to have this for all 
> columns of a family, in the schema or not. I'm trying right now to extend this 
> to schema for unknown columns. However, every new row can have a lot of very 
> different unknown columns. The defined ones will come first, but the unknown 
> ones will be appended at the end.
> This means the metadata might need to be updated at every row to account for 
> all new columns discovered.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
