You might consider using views too. Views all share the same
underlying table and each view may define different columns. They're
also updatable if your view where clause is a simple equality.
CREATE TABLE item (
type VARCHAR,
item_id VARCHAR,
price DECIMAL(10,2),
CONSTRAINT pk PRIMARY KEY (type, item_id));
CREATE VIEW apple(color VARCHAR)
AS SELECT * FROM ITEM WHERE type='Apple';
CREATE VIEW milk(quantity INTEGER)
AS SELECT * FROM ITEM WHERE type='Milk';
You could UPSERT into any of these and query across all of them
through the item table or a given type through the view.
On Tue, Dec 23, 2014 at 7:08 PM, Alex Kamil <[email protected]> wrote:
> I'll use parallel arrays for now, but STRUCT would be ideal
>
> it's a case for nested tables, each attribute can have multiple values, and
> several attributes are grouped into "sub-rows" by group id, e.g. in the
> table below the group <id: name, price> corresponds to <id1: "apple1",
> id1:1> that form one sub-row under rowkey1,and group <id2: "apple2" ,
> id2:2> form another sub-row with the same set of attributes
>
> another group is <id: order, supplier> , which has only one sub-row with two
> columns <id0: 1001, id0: "company1">
>
> in each row there are a few hundred groups with 1 to 10 attributes each
> (total up to 3000 columns in a row), the tables have 1M-100M rows, there are
> ~200 tables total.
> I would like to avoid normalization into additional tables as joins would
> slow things down
>
> | rowkey | id:order | id:name |id: price| id: supplier
>
> | rowkey1|id0: 1001 |id1: "apple1"| id1: 1 | id0: "company1"
> |id2: "apple2"| id2: 2 |
>
> | rowkey2|id3: 1002 |id4: "orange"| id4: 5 | id3: "company2"
>
> | rowkey3|id5: 1003 |id6: "pear1" | id6: 1 | id5: "company1"
> |id7: "pear2" | id7: 1 |
> |id8: "pear3" | id8: 3 |
>
> in scala i'd use this structure to represent it:
>
> case class ColumnLine(
> id: Int,
> value: Option[Any]
> )
> case class Column(
> colname: String,
> coltype: String,
> lines: Option[List[ColumnLine]]
> )
> case class Row (
> rowkey:String,
> columns:Map[String,Column] //colname -> Column
> )
> case class Table (
> name:String,
> rows:Map[String,Row] //rowkey -> Row
> )
>
>
> On Tue, Dec 23, 2014 at 9:06 PM, James Taylor <[email protected]>
> wrote:
>>
>> No, that's currently not possible. You'd may be able to leverage one
>> of the following to help you, though:
>> - parallel arrays as you've mentioned
>> - different tables with an FK (and likely an index) between them
>> - dynamic columns (http://phoenix.apache.org/dynamic_columns.html)
>> - on-the-fly updatable VIEW creation, where the VIEW represents the
>> set of tuples (http://phoenix.apache.org/views.html)
>>
>> The implementation of a STRUCT data type (PHOENIX-477) or support for
>> JSON (PHOENIX-628) may help you as well.
>>
>> Would it be possible to share more details about your use case?
>>
>> On Tue, Dec 23, 2014 at 2:07 PM, Alex Kamil <[email protected]> wrote:
>> > is there a way to represent Map.Entry<K,V>[] as a column value in
>> > phoenix,
>> > i.e. store an array of tuples <K,V> instead of creating two arrays: K
>> > VARCHAR ARRAY[] and V VARCHAR ARRAY[]
>
>