them and store in one of the fields of webpage storage
class or this step is not needed?
Thanks.
Alex.
-Original Message-
From: Lewis John Mcgibbney lewis.mcgibb...@gmail.com
To: user user@nutch.apache.org
Sent: Tue, Jul 3, 2012 5:08 am
Subject: Re: parse and solrindex in nutch-2.0
Hi,
On Mon, Jul 2, 2012 at 8:21 PM, alx...@aim.com wrote:
Regarding the metadata, what would be a proper way of parsing and indexing
multivalued tags in nutch
Hi,
Correct. When using specific_batchid or -all, you have to run the
updaterjob first, because the indexer checks that the dbupdate mark is not null. But
a workaround is to simply run the indexer with -reindex. This ignores
the db update mark and tries to index every parsed row (at any time).
About
update (or whatever the actual name of the command is) after parsing?
On 25 June 2012 22:35, alx...@aim.com wrote:
Hello,
I have tested nutch-2.0 with HBase and MySQL, trying to index only one URL with
depth 1.
I tried to fetch an HTML tag value and parse it into the metadata column of the webpage
object by adding a parse-tag plugin. I saw there is no metadata member variable
in the Parse class, so I used putToMetadata