Unsubscribe me please

2012-05-14 Thread arul velusamy
Can someone please remove me from the mailing update list? Your help is much
appreciated. Thanks.


[jira] [Commented] (NUTCH-1323) AjaxNormalizer

2012-05-14 Thread behnam nikbakht (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274529#comment-13274529
 ] 

behnam nikbakht commented on NUTCH-1323:


Yes, it works correctly. Thank you.

 AjaxNormalizer
 --

 Key: NUTCH-1323
 URL: https://issues.apache.org/jira/browse/NUTCH-1323
 Project: Nutch
  Issue Type: New Feature
Reporter: Markus Jelsma
Assignee: Markus Jelsma
 Fix For: 1.6

 Attachments: NUTCH-1323-1.6-1.patch


 A two-way normalizer for Nutch able to deal with AJAX URLs, converting them 
 to _escaped_fragment_ URLs and back to AJAX URLs.
 https://developers.google.com/webmasters/ajax-crawling/
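
For readers unfamiliar with the scheme linked above, here is a minimal sketch of the two-way conversion. It is not the code from the attached patch. The class and method names are hypothetical, and it assumes the "#!" convention of the AJAX crawling scheme, with _escaped_fragment_ appended as the last query parameter and only the characters %00-20, %23, %25-26, %2B and %7F-FF escaped.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not part of Nutch).
final class AjaxUrlSketch {
    private static final String HASH_BANG = "#!";
    private static final String PARAM = "_escaped_fragment_=";

    /** AJAX URL to _escaped_fragment_ URL; input is returned unchanged if it has no "#!". */
    static String toEscapedFragment(String url) {
        int i = url.indexOf(HASH_BANG);
        if (i < 0) return url;
        String base = url.substring(0, i);
        String fragment = url.substring(i + HASH_BANG.length());
        // Append as an extra parameter if a query string is already present.
        char sep = base.indexOf('?') >= 0 ? '&' : '?';
        return base + sep + PARAM + escape(fragment);
    }

    /** _escaped_fragment_ URL back to AJAX URL; assumes the parameter comes last. */
    static String toAjax(String url) {
        int i = url.indexOf(PARAM);
        if (i < 1) return url;
        char sep = url.charAt(i - 1);
        if (sep != '?' && sep != '&') return url;
        String base = url.substring(0, i - 1);
        return base + HASH_BANG + unescape(url.substring(i + PARAM.length()));
    }

    // Percent-encode only the byte ranges assumed above.
    private static String escape(String fragment) {
        StringBuilder sb = new StringBuilder();
        for (byte b : fragment.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (c <= 0x20 || c == 0x23 || c == 0x25 || c == 0x26 || c == 0x2B || c >= 0x7F) {
                sb.append(String.format("%%%02X", c));
            } else {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    // Decode every valid %XX sequence; '+' is left as-is (unlike form decoding).
    private static String unescape(String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < value.length()) {
            int hi = -1, lo = -1;
            if (value.charAt(i) == '%' && i + 2 < value.length()) {
                hi = Character.digit(value.charAt(i + 1), 16);
                lo = Character.digit(value.charAt(i + 2), 16);
            }
            if (hi >= 0 && lo >= 0) {
                out.write((hi << 4) | lo);
                i += 3;
            } else {
                int cp = value.codePointAt(i);
                byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                out.write(bytes, 0, bytes.length);
                i += Character.charCount(cp);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ajax = "http://example.com/page?q=1#!a&b";
        String escaped = toEscapedFragment(ajax);
        System.out.println(escaped);
        System.out.println(toAjax(escaped));
    }
}
```

Under these assumptions, http://example.com/page#!key=value maps to http://example.com/page?_escaped_fragment_=key=value, and toAjax reverses it.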

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (NUTCH-1367) Port ParserChecker to Nutchgora

2012-05-14 Thread Ferdy Galema (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274590#comment-13274590
 ] 

Ferdy Galema commented on NUTCH-1367:
-

Hey Lewis,

This tool is already present in Nutchgora.

 Port ParserChecker to Nutchgora
 ---

 Key: NUTCH-1367
 URL: https://issues.apache.org/jira/browse/NUTCH-1367
 Project: Nutch
  Issue Type: New Feature
  Components: parser
Affects Versions: nutchgora
Reporter: Lewis John McGibbney
 Fix For: 2.1


 This is such a great tool. It has come in handy so many times I would go blue 
 in the face if I had to try and count, e.g. for (int i = 0; i < infinity; i++).
 I think you get the idea.





[jira] [Closed] (NUTCH-1366) speed up indexing by eliminating the indexreducer

2012-05-14 Thread Ferdy Galema (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdy Galema closed NUTCH-1366.
---

Resolution: Fixed

committed

 speed up indexing by eliminating the indexreducer
 -

 Key: NUTCH-1366
 URL: https://issues.apache.org/jira/browse/NUTCH-1366
 Project: Nutch
  Issue Type: Improvement
  Components: indexer
Reporter: Ferdy Galema
 Fix For: nutchgora

 Attachments: NUTCH-1366.patch


 Currently the indexer in Nutchgora consists of both mappers and reducers, but 
 the reduce code does not actually iterate over any (grouped/sorted) values. 
 It simply indexes individual key/value (String/Webpage) pairs. By moving 
 this indexing code to the mapper we can eliminate the reduce step, making 
 the indexing job much faster (no more unnecessary spilling to disk/network 
 and no CPU wasted on sorting).
 Note this is not (directly) applicable to trunk because trunk uses a quite 
 different approach: different types of input are combined into a single value 
 in the reducer. Although I think a similar optimization is possible, I am 
 not sure how to do it. So if anyone wants this for trunk too, feel free to 
 implement a similar patch.





[jira] [Commented] (NUTCH-1366) speed up indexing by eliminating the indexreducer

2012-05-14 Thread Hudson (JIRA)

Hudson commented on NUTCH-1366:

speed up indexing by eliminating the indexreducer

Integrated in Nutch-nutchgora #253 (See https://builds.apache.org/job/Nutch-nutchgora/253/)
NUTCH-1366 speed up indexing by eliminating the indexreducer (Revision 1338217)

 Result = SUCCESS
ferdy : 
Files : 

	/nutch/branches/nutchgora/CHANGES.txt
	/nutch/branches/nutchgora/src/java/org/apache/nutch/indexer/IndexUtil.java
	/nutch/branches/nutchgora/src/java/org/apache/nutch/indexer/IndexerJob.java
	/nutch/branches/nutchgora/src/java/org/apache/nutch/indexer/IndexerReducer.java