Is my configuration incorrect? Any idea why values such as fetchTime
look strange here?

On Fri, Apr 24, 2015 at 10:06 PM, Arthur Chan <[email protected]>
wrote:

> Hi,
>
> I am running Nutch 2.3 with Gora and HBase. Below are sample field values
> that I scanned from HBase:
>
> baseUrl: value=http://www.apache.org
> status: value=\x00\x00\x00\x01
> prevFetchTime: value=\x00\x00\x01L\x91]\xF5\x1C
> fetchTime: value=\x00\x00\x01L\x93\x92\x0F\x5C
> fetchInterval: value=\x00'\x8D\x00
> retriesSinceFetch: value=\x00\x00\x00\x00
> reprUrl: value=http://www.apache.org
> protocolStatus: value=\x02\x00\x00
> modifiedTime: value=\x00\x00\x01L\x93@\xE1H
> prevModifiedTime: value=\x00\x00\x00\x00\x00\x00\x00\x00
> batchId: value=1428399528-1598360492
> parseStatus: value=\x02\x00\x00
> signature: value=\xD7\xA7\x04pT7?E\xFA\x1A\x01"\x08\x89$0
> prevSignature: value=\x85\xC2i@\xFC(\xDE\xEEt?\xE7\xFB\xE1rY\xAF
> score: value=\x00\x00\x00\x00
>
>
>
>
> Below is the part of my gora-hbase-mapping.xml that maps these fields:
>
>         <field name="baseUrl" family="f" qualifier="bas"/>
>         <field name="status" family="f" qualifier="st"/>
>         <field name="prevFetchTime" family="f" qualifier="pts"/>
>         <field name="fetchTime" family="f" qualifier="ts"/>
>         <field name="fetchInterval" family="f" qualifier="fi"/>
>         <field name="retriesSinceFetch" family="f" qualifier="rsf"/>
>         <field name="reprUrl" family="f" qualifier="rpr"/>
>         <field name="content" family="f" qualifier="cnt"/>
>         <field name="contentType" family="f" qualifier="typ"/>
>         <field name="protocolStatus" family="f" qualifier="prot"/>
>         <field name="modifiedTime" family="f" qualifier="mod"/>
>         <field name="prevModifiedTime" family="f" qualifier="pmod"/>
>         <field name="batchId" family="f" qualifier="bid"/>
>         <field name="title" family="p" qualifier="t"/>
>         <field name="text" family="p" qualifier="c"/>
>         <field name="parseStatus" family="p" qualifier="st"/>
>         <field name="signature" family="p" qualifier="sig"/>
>         <field name="prevSignature" family="p" qualifier="psig"/>
>         <field name="score" family="s" qualifier="s"/>
>
>
>
> Q: Is there a way to configure Nutch/Gora/HBase so that it stores the values
> in a readable form like the following, with no field type conversion needed?
>
> baseUrl:    null
> status: 4 (status_redir_temp)
> fetchTime:  1426888912463
> prevFetchTime:  1424296904936
> fetchInterval:  2592000
> retriesSinceFetch:  0
> modifiedTime:   0
> prevModifiedTime:   0
> protocolStatus: (null)
> parseStatus:    (null)
> title:  null
> score:  1.0
> marker _injmrk_ :   y
> marker dist :   0
> reprUrl:    null
> batchId:    1424296906-20007
>
>
> Please help!
>
> Regards
>
>
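
For what it is worth, the scanned values quoted above look like raw
big-endian binary numbers rather than corrupted data: the HBase shell prints
printable bytes as ASCII characters (so "L" is 0x4C and "'" is 0x27) and
every other byte as \xNN. Below is a minimal Python sketch that decodes a few
of them. It assumes fetchTime and prevFetchTime are 8-byte longs, status and
fetchInterval are 4-byte ints, and score is a 4-byte float; those types are
my assumption and are not confirmed anywhere in this thread. The helper names
(shell_to_bytes, decode_long, and so on) are made up for illustration.

```python
import re
import struct
from datetime import datetime, timezone


def shell_to_bytes(text: str) -> bytes:
    """Convert HBase-shell style output (e.g. \\x00\\x01L) to raw bytes."""
    out = bytearray()
    pos = 0
    for match in re.finditer(r"\\x([0-9A-Fa-f]{2})", text):
        # Characters between escapes were printed as plain ASCII.
        out += text[pos:match.start()].encode("latin-1")
        out.append(int(match.group(1), 16))
        pos = match.end()
    out += text[pos:].encode("latin-1")
    return bytes(out)


def decode_long(text: str) -> int:
    # Assumption: 8-byte big-endian signed integer.
    return struct.unpack(">q", shell_to_bytes(text))[0]


def decode_int(text: str) -> int:
    # Assumption: 4-byte big-endian signed integer.
    return struct.unpack(">i", shell_to_bytes(text))[0]


def decode_float(text: str) -> float:
    # Assumption: 4-byte big-endian IEEE 754 float.
    return struct.unpack(">f", shell_to_bytes(text))[0]


def to_utc(millis: int) -> str:
    """Render epoch milliseconds as an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    fetch_time = decode_long(r"\x00\x00\x01L\x93\x92\x0F\x5C")
    print("fetchTime:", fetch_time, to_utc(fetch_time))
    print("fetchInterval:", decode_int(r"\x00'\x8D\x00"))
    print("status:", decode_int(r"\x00\x00\x00\x01"))
    print("score:", decode_float(r"\x00\x00\x00\x00"))
```

Under those assumptions fetchTime decodes to 1428404965212 (epoch
milliseconds, 7 April 2015 UTC) and fetchInterval to 2592000 seconds, that is
30 days, the same interval as in the readable listing. protocolStatus and
parseStatus are not plain numbers and are not covered by this sketch. On the
Java side, org.apache.hadoop.hbase.util.Bytes has toLong, toInt and toFloat
for the same kind of conversion.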
