I'll use parallel arrays for now, but STRUCT would be ideal.

It's a case for nested tables: each attribute can have multiple values, and
several attributes are grouped into "sub-rows" by group id. E.g. in the
table below, the group *<id: name, price>* corresponds to
*<id1: "apple1", id1: 1>*, which forms one sub-row under rowkey1, and
*<id2: "apple2", id2: 2>* forms another sub-row with the same set of
attributes.

Another group is *<id: order, supplier>*, which has only one sub-row with
two columns: *<id0: 1001, id0: "company1">*.

Each row has a few hundred groups with 1 to 10 attributes each (up to 3000
columns per row in total), the tables have 1M-100M rows, and there are ~200
tables in total.
I would like to avoid normalizing into additional tables, as joins would
slow things down.

| rowkey  | id:order  | id:name       | id:price | id:supplier     |

| rowkey1 | id0: 1001 | id1: "apple1" | id1: 1   | id0: "company1" |
|         |           | id2: "apple2" | id2: 2   |                 |

| rowkey2 | id3: 1002 | id4: "orange" | id4: 5   | id3: "company2" |

| rowkey3 | id5: 1003 | id6: "pear1"  | id6: 1   | id5: "company1" |
|         |           | id7: "pear2"  | id7: 1   |                 |
|         |           | id8: "pear3"  | id8: 3   |                 |
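
To make the parallel-array workaround concrete, here is a rough sketch for the
<id: name, price> group of rowkey3. It is plain Scala, not Phoenix API, and
the names (SubRow, ParallelArrays, ids/names/prices) are made up for
illustration:

```scala
// Hypothetical names throughout; plain Scala, no Phoenix API involved.
// One sub-row of the <id: name, price> group.
case class SubRow(id: Int, name: String, price: Int)

object ParallelArrays {
  // rowkey3 from the table, stored as three parallel arrays:
  // position i in each array belongs to the same sub-row.
  val ids: Array[Int]      = Array(6, 7, 8)
  val names: Array[String] = Array("pear1", "pear2", "pear3")
  val prices: Array[Int]   = Array(1, 1, 3)

  // Rebuild the sub-rows by reading the arrays position by position.
  def subRows(ids: Array[Int], names: Array[String], prices: Array[Int]): List[SubRow] = {
    require(ids.length == names.length && ids.length == prices.length,
      "parallel arrays must have the same length")
    ids.indices.map(i => SubRow(ids(i), names(i), prices(i))).toList
  }

  // Split sub-rows back into parallel arrays, e.g. before writing a row.
  def toArrays(rows: List[SubRow]): (Array[Int], Array[String], Array[Int]) =
    (rows.map(_.id).toArray, rows.map(_.name).toArray, rows.map(_.price).toArray)
}
```

The catch is that nothing ties the arrays together except position, so the
lengths have to be kept in sync by hand; that is the part a STRUCT type would
take care of.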

In Scala I'd use this structure to represent the nested table:

case class ColumnLine(
  id: Int,
  value: Option[Any])

case class Column(
  colname: String,
  coltype: String,
  lines: Option[List[ColumnLine]])

case class Row(
  rowkey: String,
  columns: Map[String, Column]) // colname -> Column

case class Table(
  name: String,
  rows: Map[String, Row]) // rowkey -> Row
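
For illustration, rowkey1 from the table could be built with these case
classes roughly as follows. This is only a sketch: the coltype strings, the
table name "fruit" and the subRow helper are assumptions made up for the
example.

```scala
case class ColumnLine(id: Int, value: Option[Any])
case class Column(colname: String, coltype: String, lines: Option[List[ColumnLine]])
case class Row(rowkey: String, columns: Map[String, Column])   // colname -> Column
case class Table(name: String, rows: Map[String, Row])         // rowkey -> Row

object NestedTableExample {
  // rowkey1: group id 0 holds <order, supplier>, ids 1 and 2 hold <name, price>.
  // The coltype values are assumed, not taken from a real schema.
  val row1: Row = Row("rowkey1", Map(
    "order" -> Column("order", "INTEGER",
      Some(List(ColumnLine(0, Some(1001))))),
    "name" -> Column("name", "VARCHAR",
      Some(List(ColumnLine(1, Some("apple1")), ColumnLine(2, Some("apple2"))))),
    "price" -> Column("price", "INTEGER",
      Some(List(ColumnLine(1, Some(1)), ColumnLine(2, Some(2))))),
    "supplier" -> Column("supplier", "VARCHAR",
      Some(List(ColumnLine(0, Some("company1")))))
  ))

  // Hypothetical table name.
  val table: Table = Table("fruit", Map(row1.rowkey -> row1))

  // Collect every value that carries the given group id into one
  // sub-row, keyed by column name.
  def subRow(row: Row, id: Int): Map[String, Any] =
    (for {
      (name, col) <- row.columns.toList
      line        <- col.lines.getOrElse(Nil)
      if line.id == id
      value       <- line.value
    } yield name -> value).toMap
}
```

With this, subRow(row1, 1) gives the "apple1" sub-row (name and price), and
subRow(row1, 0) gives the order/supplier sub-row.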


On Tue, Dec 23, 2014 at 9:06 PM, James Taylor <[email protected]>
wrote:

> No, that's currently not possible. You may be able to leverage one
> of the following to help you, though:
> - parallel arrays as you've mentioned
> - different tables with an FK (and likely an index) between them
> - dynamic columns (http://phoenix.apache.org/dynamic_columns.html)
> - on-the-fly updatable VIEW creation, where the VIEW represents the
> set of tuples (http://phoenix.apache.org/views.html)
>
> The implementation of a STRUCT data type (PHOENIX-477) or support for
> JSON (PHOENIX-628) may help you as well.
>
> Would it be possible to share more details about your use case?
>
> On Tue, Dec 23, 2014 at 2:07 PM, Alex Kamil <[email protected]> wrote:
> > is there a way to represent Map.Entry<K,V>[] as a column value in
> > phoenix, i.e. store an array of tuples <K,V> instead of creating two
> > arrays: K VARCHAR ARRAY[] and V VARCHAR ARRAY[]
>
