[ 
https://issues.apache.org/jira/browse/NUTCH-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192600#comment-16192600
 ] 

ASF GitHub Bot commented on NUTCH-2435:
---------------------------------------

maborec commented on issue #225: NUTCH-2435 - New parameter "parser.store.text"
URL: https://github.com/apache/nutch/pull/225#issuecomment-334393715
 
 
   Hi @sebastian-nagel, 
   I can see you approved the changes some days ago.
   Do you plan to merge them into the repository? Is there anything missing 
from my side to proceed?
   Thanks!
   M
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> New configuration allowing to choose whether to store 'parse_text' directory 
> or not.
> ------------------------------------------------------------------------------------
>
>                 Key: NUTCH-2435
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2435
>             Project: Nutch
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.13
>         Environment: Apache Nutch 1.13
>            Reporter: Marcos Bori
>
> Whenever a page is parsed, one of the outputs is the 'parse_text' directory.
> It is intended to be used in the indexing phase so that the page can be 
> searched from a search engine such as Solr.
> In my particular crawling case, I don't need to index the page contents. 
> Therefore, creating and filling the 'parse_text' directory is not required 
> for me. To improve performance, I don't want the crawler to write this 
> information to the filesystem. 
> I propose a new parameter "parser.store.text" that controls whether the 
> 'parse_text' directory is stored. Its default value is "true", which keeps 
> the current behavior.
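
A minimal sketch of how the proposed parameter might be switched off, assuming the patch is merged and the property is read from the usual Nutch configuration files. Only the name "parser.store.text" and its default of "true" come from the description above; the file location (conf/nutch-site.xml) and the description text are assumptions for illustration.

```xml
<!-- Hypothetical override, placed inside the <configuration> element
     of conf/nutch-site.xml. Assumes the proposed parameter exists. -->
<property>
  <name>parser.store.text</name>
  <!-- "true" (the proposed default) keeps writing 'parse_text';
       "false" would skip it for crawls that never index page text. -->
  <value>false</value>
  <description>Illustrative only: whether the parser stores the
  'parse_text' directory.</description>
</property>
```

With the value left at "true" or the property omitted, parsing would behave as it does today.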



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
