[ https://issues.apache.org/jira/browse/SPARK-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gen TANG updated SPARK-5090:
----------------------------
Description:
The Python converter `HBaseResultToStringConverter` provided in
HBaseConverter.scala returns only the value of the first column in the result.
This limits the utility of the converter: it returns only one value per row
(although HBase may hold several versions) and it loses the other information
in the record, such as column:cell and timestamp.
Here we would like to propose an improvement to the Python converter so that it
returns all the records in the result (in a single string) with more complete
information. We would also like to make some improvements to
hbase_inputformat.py.
was:
The Python converter `HBaseResultToStringConverter` provided in
HBaseConverter.scala returns only the value of the first column in the result.
This limits the utility of the converter: it returns only one value per row
(although HBase may hold several versions) and it loses the other information
in the record, such as column:cell and timestamp.
Here we would like to propose an improvement to the Python converter so that it
returns all the records in the result (in a single string) with more complete
information.
> The improvement of python converter for hbase
> ---------------------------------------------
>
> Key: SPARK-5090
> URL: https://issues.apache.org/jira/browse/SPARK-5090
> Project: Spark
> Issue Type: Improvement
> Components: Examples
> Affects Versions: 1.2.0
> Reporter: Gen TANG
> Labels: hbase, python
> Fix For: 1.2.1
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> The Python converter `HBaseResultToStringConverter` provided in
> HBaseConverter.scala returns only the value of the first column in the result.
> This limits the utility of the converter: it returns only one value per row
> (although HBase may hold several versions) and it loses the other information
> in the record, such as column:cell and timestamp.
> Here we would like to propose an improvement to the Python converter so that
> it returns all the records in the result (in a single string) with more
> complete information. We would also like to make some improvements to
> hbase_inputformat.py.
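
To make the proposal concrete, here is a minimal Python sketch of the difference between keeping only the first value and returning every record in a single string. It does not use HBase or Spark. The `Cell` tuple, the function names, and the choice of one JSON object per line are assumptions made for illustration, not the converter's actual output format.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for one HBase cell; this is not an HBase or Spark API.
Cell = namedtuple("Cell", ["row", "family", "qualifier", "timestamp", "value"])


def first_value_only(cells):
    """Mimic the limitation described above: keep only the first cell's value."""
    return cells[0].value if cells else ""


def all_cells_as_string(cells):
    """Serialize every cell as one JSON object per line, in a single string."""
    return "\n".join(json.dumps(c._asdict(), sort_keys=True) for c in cells)


def parse_cells(text):
    """Python-side helper: split the single string back into dictionaries."""
    return [json.loads(line) for line in text.splitlines() if line]


# Sample data (made up): two versions of f1:a and one version of f1:b.
cells = [
    Cell("row1", "f1", "a", 1420070400000, "value1"),
    Cell("row1", "f1", "a", 1420070300000, "value0"),  # older version of f1:a
    Cell("row1", "f1", "b", 1420070400000, "value2"),
]

for record in parse_cells(all_cells_as_string(cells)):
    print(record["family"], record["qualifier"], record["timestamp"], record["value"])
```

With this layout, `first_value_only(cells)` keeps a single value, while the single-string form preserves the column, timestamp, and every version, and the Python side can recover them by splitting on newlines.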