[
https://issues.apache.org/jira/browse/PIG-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466649#comment-13466649
]
Bill Graham commented on PIG-2341:
----------------------------------
Jayesh, this patch is great. Thanks for taking this on. Just a few comments:
- The various options should be listed in the Terms table, instead of in usage.
- Try to talk about what it does, as opposed to what it can do. For example,
"HBaseStorage can store and load data from HBase" should be "HBaseStorage
stores and loads data from HBase"
- When describing the various params, describe what the param does, as opposed
to how it does it. For example, "This specifies to the HBase scan method to
read rows greater than minKeyVal" should be "Specifies only rows with a rowKey
greater than minKeyVal are to be returned".
- "and a wildcard as a suffix" should be "followed by an asterisk (\*)". "using
the column family name and a wildcard" should be "using the column family name
and an asterisk (i.e., cf:*)"
- "Columns from multiple column families are specified by seperating each
column family and column qualifier pair by a single space." should be "Columns
from multiple column families can be returned." No need to specify the space
delimiter, since you already have.
- Likewise, the last two sentance of Usage can be omitted, since you mention
above that not all columns must be specified.
- Should specify that loadKey is false by default as well as how it inserts an
extra field as the first element fo the schema, before the columns specified.
- There are a few more options to describe (see the constructor javadocs in the
code on the trunk): delim, ignoreWhitespace, noWAL, minTimestamp, maxTimestamp,
timestamp. Note that the "extreme caution" warning in the javadoc is
mis-located. It should apply to the noWAL option.
- We should add some discussion about STORE and how the first field needs to be
the rowKey, as well as how maps and scalars are handled. See the Javadoc of the
class for a description of this.
Also, after you upload the next patch (typically named something like
PIG-2341_2.patch) you'll want to set the "patch available" flag, which alerts
folks that it's ready for review.
> Need better documentation on Pig/HBase integration
> --------------------------------------------------
>
> Key: PIG-2341
> URL: https://issues.apache.org/jira/browse/PIG-2341
> Project: Pig
> Issue Type: Sub-task
> Components: documentation
> Affects Versions: 0.9.0
> Reporter: Mikael Sitruk
> Labels: hbase
> Attachments: PIG-2341.patch
>
>
> One of the nice thing between Pig and Hbase is that they can be integrated.
> Thanks to recent patch (PIG-1250) committed.
> The documentation is not well updated yet (currently almost relate to the
> patch itself). It world be nice to document this feature in detail in the Pig
> documentation page (e.g, in here:
> http://pig.apache.org/docs/r0.9.1/func.html#load-store-functions).
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira