Is my configuration incorrect? Any idea why values such as fetchTime look strange here?
On Fri, Apr 24, 2015 at 10:06 PM, Arthur Chan <[email protected]> wrote:
> Hi,
>
> My Nutch is 2.3 with Gora and Hbase, below are the sample field values I
> have scanned from HBase here:
>
> baseUrl: value=http://www.apache.org
> status: value=\x00\x00\x00\x01
> prevFetchTime: value=\x00\x00\x01L\x91]\xF5\x1C
> fetchTime: value=\x00\x00\x01L\x93\x92\x0F\x5C
> fetchInterval: value=\x00'\x8D\x00
> retriesSinceFetch: \x00\x00\x00\x00
> reprUrl: value=http://www.apache.org
> protocolStatus: value=\x02\x00\x00
> modifiedTime: value=\x00\x00\x01L\x93@\xE1H
> prevModifiedTime: value=\x00\x00\x00\x00\x00\x00\x00\x00
> batchId: value=1428399528-1598360492
> parseStatus: value=\x02\x00\x00
> signature: value=\xD7\xA7\x04pT7?E\xFA\x1A\x01"\x08\x89$0
> prevSignature: value=\x85\xC2i@\xFC(\xDE\xEEt?\xE7\xFB\xE1rY\xAF
> score: value=\x00\x00\x00\x00
>
>
> Below is my related gora-hbase-mapping.xml about these fields
>
> <field name="baseUrl" family="f" qualifier="bas"/>
> <field name="status" family="f" qualifier="st"/>
> <field name="prevFetchTime" family="f" qualifier="pts"/>
> <field name="fetchTime" family="f" qualifier="ts"/>
> <field name="fetchInterval" family="f" qualifier="fi"/>
> <field name="retriesSinceFetch" family="f" qualifier="rsf"/>
> <field name="reprUrl" family="f" qualifier="rpr"/>
> <field name="content" family="f" qualifier="cnt"/>
> <field name="contentType" family="f" qualifier="typ"/>
> <field name="protocolStatus" family="f" qualifier="prot"/>
> <field name="modifiedTime" family="f" qualifier="mod"/>
> <field name="prevModifiedTime" family="f" qualifier="pmod"/>
> <field name="batchId" family="f" qualifier="bid"/>
> <field name="title" family="p" qualifier="t"/>
> <field name="text" family="p" qualifier="c"/>
> <field name="parseStatus" family="p" qualifier="st"/>
> <field name="signature" family="p" qualifier="sig"/>
> <field name="prevSignature" family="p" qualifier="psig"/>
> <field name="score" family="s" qualifier="s"/>
>
>
> Q: Is there a way to configure Nutch/Gora/HBase so it will store the value
> like following and no need to do field type conversion?
>
> baseUrl: null
> status: 4 (status_redir_temp)
> fetchTime: 1426888912463
> prevFetchTime: 1424296904936
> fetchInterval: 2592000
> retriesSinceFetch: 0
> modifiedTime: 0
> prevModifiedTime: 0
> protocolStatus: (null)
> parseStatus: (null)
> title: null
> score: 1.0
> marker _injmrk_ : y
> marker dist : 0
> reprUrl: null
> batchId: 1424296906-20007
>
>
> Please help!
>
> Regards
>
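
For what it is worth, the scanned values above look like raw binary cells printed in the HBase shell's escaped form (printable bytes shown as characters, everything else as \xNN), rather than corrupted data. Below is a minimal Python sketch for decoding them by hand. It rests on assumptions not confirmed in this thread: that the time fields are 8-byte big-endian longs, that status and fetchInterval are 4-byte big-endian ints, and that score is a 4-byte big-endian float. The helper names (from_string_binary, decode_long, decode_int, decode_float) are hypothetical and not part of Nutch, Gora, or HBase.

```python
import re
import struct

# Matches the \xNN escapes that appear in the shell output quoted above.
_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def from_string_binary(text: str) -> bytes:
    """Turn shell-style escaped text back into raw bytes.

    Assumption: every non-printable byte (and any literal backslash)
    appears as \\xNN, and all other characters are plain ASCII.
    """
    raw = text.encode("ascii")
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def decode_long(text: str) -> int:
    """Decode an 8-byte big-endian signed long (assumed for the time fields)."""
    return struct.unpack(">q", from_string_binary(text))[0]


def decode_int(text: str) -> int:
    """Decode a 4-byte big-endian signed int (assumed for status, fetchInterval)."""
    return struct.unpack(">i", from_string_binary(text))[0]


def decode_float(text: str) -> float:
    """Decode a 4-byte big-endian IEEE 754 float (assumed for score)."""
    return struct.unpack(">f", from_string_binary(text))[0]


if __name__ == "__main__":
    # Values copied from the scan output in the quoted message.
    print("fetchTime:", decode_long(r"\x00\x00\x01L\x93\x92\x0F\x5C"))
    print("fetchInterval:", decode_int(r"\x00'\x8D\x00"))
    print("status:", decode_int(r"\x00\x00\x00\x01"))
    print("score:", decode_float(r"\x00\x00\x00\x00"))
```

Under these assumptions the fetchInterval cell decodes to 2592000, the same figure shown in the desired listing, which suggests the stored data is intact and only the shell's display form differs.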

