[jira] [Commented] (HBASE-10091) Exposing HBase DataTypes to non-Java interfaces
[ https://issues.apache.org/jira/browse/HBASE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956775#comment-13956775 ]

Nick Dimiduk commented on HBASE-10091:
--

Looks like [~navis] has been thinking about how to specify a composite as well.

> Exposing HBase DataTypes to non-Java interfaces
> ---
>
> Key: HBASE-10091
> URL: https://issues.apache.org/jira/browse/HBASE-10091
> Project: HBase
> Issue Type: Sub-task
> Components: Client
> Reporter: Nick Dimiduk
>
> Access to the DataType implementations introduced in HBASE-8693 is currently
> limited to consumers of the Java API. It is not easy to specify a data type
> in non-Java environments, such as the HBase shell, REST or Thrift Gateways,
> command-line arguments to our utility MapReduce jobs, or in integration
> points such as a (hypothetical extension to) Hive's HBaseStorageHandler. See
> examples where this limitation impedes in HBASE-8593 and HBASE-10071.
> I propose the implementation of a type definition DSL, similar to the
> language defined for Filters in HBASE-4176. By implementing this in core
> HBase, it can be reused in all of the situations described previously. The
> parser for this DSL must support arbitrary type extensions, just as the
> Filter parser allows for new Filter types to be registered at runtime.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10091) Exposing HBase DataTypes to non-Java interfaces
[ https://issues.apache.org/jira/browse/HBASE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955877#comment-13955877 ]

Nick Dimiduk commented on HBASE-10091:
--

I haven't worked through a prototype yet, so I don't know exactly. The DSL we have for exposing filters is parsed once, in Java (using [ParseFilter|https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ParseFilter.html]), by the shell or Thrift service (I guess the REST service doesn't support this yet). The user would provide the type mapping as a configuration string and let whatever is interacting with the HTable handle sending the provided data literals to the correct DataType instances.

One example consumer is the Hive metastore. A table defined in the metastore has a column mapping, similar to today, that maps each metastore table column to an HBase table column. In addition to the column mapping, a type specification is also provided. This would be an expression in the DSL we're discussing. The StorageHandler would be responsible for honoring this additional component in the mapping. How exactly we ensure the metastore type can be converted to/from the HBase {{DataType}} is still an open question. I hope to learn from Phoenix on this, hence I deferred that work to HBASE-8863.

More concretely, I imagine this DSL is relatively simple. A complete type definition might be as simple as {{package.class\[/ORDER\]}}. We'll need to add any necessary API to {{DataType}} to support constructing from the parser. There may also be some built-in named definitions, "raw" or "ordered-bytes", where we ship an existing known mapping between Java type and HBase DataType implementation. This would be a convenience for consumers of HTable; I don't know how this would play into a metastore implementation.

The only place where potential overlap with Avro/Protobuf comes in is with [Struct|http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/types/Struct.html]. I'm not convinced this is very complicated either; it is just a sequence of types with syntax for specifying an optional element. There's no concept of "schema versioning" in {{Struct}}; there's no room for it in a place where encoded ordering is the primary concern.
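To make the {{package.class\[/ORDER\]}} form above concrete, here is a minimal sketch of how such a string could be split into a class name and an optional order. Everything in it is an assumption for illustration: {{TypeSpec}}, its fields, and the default of ASCENDING are hypothetical and not part of HBase, and the local {{Order}} enum only mirrors the ASCENDING/DESCENDING values of HBase's own {{Order}} so the example stays self-contained. It does not instantiate a {{DataType}}; the constructor API that would need is still undecided.

```java
import java.util.Locale;

/** Hypothetical holder for a parsed type definition; not part of HBase. */
final class TypeSpec {
    /** Local stand-in for a sort order, kept here so the sketch has no HBase dependency. */
    enum Order { ASCENDING, DESCENDING }

    final String className;
    final Order order;

    TypeSpec(String className, Order order) {
        this.className = className;
        this.order = order;
    }

    /** Parses "package.Class" or "package.Class/ORDER". */
    static TypeSpec parse(String expr) {
        String s = expr.trim();
        int slash = s.indexOf('/');
        String name = (slash < 0) ? s : s.substring(0, slash);
        if (name.isEmpty()) {
            throw new IllegalArgumentException("missing class name in: " + expr);
        }
        // Assumption: when no order is given, default to ASCENDING.
        Order order = Order.ASCENDING;
        if (slash >= 0) {
            // valueOf throws IllegalArgumentException for an unknown or empty order.
            order = Order.valueOf(s.substring(slash + 1).toUpperCase(Locale.ROOT));
        }
        return new TypeSpec(name, order);
    }
}
```

For example, {{TypeSpec.parse("org.apache.hadoop.hbase.types.OrderedInt64/DESCENDING")}} would yield the class name plus a descending order, which a real parser could then hand to whatever construction API {{DataType}} ends up exposing.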
[jira] [Commented] (HBASE-10091) Exposing HBase DataTypes to non-Java interfaces
[ https://issues.apache.org/jira/browse/HBASE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934275#comment-13934275 ]

stack commented on HBASE-10091:
---

Tell me more about how this would work. Would hbase:int map to org.apache.hadoop.hbase.types.RawInt in the DSL? Would each language have to have an interpreter for the DSL? Is there some overlap with how types are called out in Avro/PB IDLs?
[jira] [Commented] (HBASE-10091) Exposing HBase DataTypes to non-Java interfaces
[ https://issues.apache.org/jira/browse/HBASE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840689#comment-13840689 ]

Nick Dimiduk commented on HBASE-10091:
--

Such a parser should probably allow registering aliases for a more complete type definition in order to make interactive experiences (such as HBASE-10071) more palatable. One could establish aliases for the session, say {{long => OrderedInt64#DESCENDING}}, {{decimal => OrderedNumeric}}, and {{my_struct => some composite Struct declaration}}.
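The session aliases described above could be as little as a lookup table consulted before the real parser runs. The sketch below assumes exactly that; {{TypeAliases}}, {{register}}, and {{resolve}} are hypothetical names, not an existing HBase API, and the definition strings are treated as opaque text to be handed on to whatever parser is eventually written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical session-scoped alias table; not an HBase API. */
final class TypeAliases {
    // Insertion-ordered so a shell could list aliases in the order they were defined.
    private final Map<String, String> aliases = new LinkedHashMap<>();

    /** Binds a short name to a full type definition, replacing any earlier binding. */
    void register(String alias, String definition) {
        aliases.put(alias, definition);
    }

    /** Returns the registered definition, or the input unchanged if it is not an alias. */
    String resolve(String nameOrDefinition) {
        return aliases.getOrDefault(nameOrDefinition, nameOrDefinition);
    }
}
```

With {{register("long", "OrderedInt64#DESCENDING")}} in place, {{resolve("long")}} returns the full definition, while an unregistered string such as {{OrderedNumeric}} passes through untouched, so full definitions and aliases can be used interchangeably.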