GitHub user PeterCiuffetti opened a pull request:
https://github.com/apache/nutch/pull/44
Nutch 2058 - New index-replace plugin that allows regexp field value
replacements
Modifies the NutchDocument during the IndexingFilter phase to do regexp
replacements on specified fields.
See https://issues.apache.org/jira/browse/NUTCH-2058
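For readers unfamiliar with the idea, the sketch below shows what a per-field regexp replacement could look like. It is not the plugin's code and does not use the Nutch API: a plain map of field names to value lists stands in for a NutchDocument, and the class name FieldReplaceSketch, the Rule type, and the applyRules method are all hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustration only: a Map<String, List<String>> stands in for a NutchDocument.
// None of these names come from the actual index-replace plugin.
public class FieldReplaceSketch {

    /** One hypothetical rule: replace every match of a pattern in one field. */
    static final class Rule {
        final String field;
        final Pattern pattern;
        final String replacement;

        Rule(String field, String regex, String replacement) {
            this.field = field;
            this.pattern = Pattern.compile(regex);
            this.replacement = replacement;
        }
    }

    /** Returns a copy of the document with each rule applied to its field's values. */
    static Map<String, List<String>> applyRules(Map<String, List<String>> doc,
                                                List<Rule> rules) {
        // Copy first so the caller's document is left untouched.
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : doc.entrySet()) {
            out.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        for (Rule rule : rules) {
            List<String> values = out.get(rule.field);
            if (values == null) {
                continue; // field not present in this document; nothing to do
            }
            values.replaceAll(v -> rule.pattern.matcher(v).replaceAll(rule.replacement));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("title", new ArrayList<>(Arrays.asList("Hello   World")));
        doc.put("url", new ArrayList<>(Arrays.asList("http://example.com/")));

        // Collapse runs of whitespace in the "title" field only.
        List<Rule> rules = Arrays.asList(new Rule("title", "\\s+", " "));
        System.out.println(applyRules(doc, rules));
    }
}
```

Fields that no rule names pass through unchanged, which mirrors the "specified fields" wording above. How the real plugin is configured is described in the JIRA issue and the plugin's README.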
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/PeterCiuffetti/nutch NUTCH-2058
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nutch/pull/44.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #44
----
commit dc32ce6dd66b4e712b1e9693a4e726febbc8171e
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-01T13:31:03Z
Initial checkin got parse-replace
commit 2eebd285232bd0595bf321add1d35ae1a60e7d07
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-01T13:31:11Z
Merge branch 'trunk' of github.com:apache/nutch into parse-replace
commit a2c1851e096bfd528b722778671490d4fd610a4b
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-02T14:27:19Z
Refactored from a parse filter to an index filter
commit 57748e0de2e7fc60d349462144c3ed7703ac0957
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-04T09:22:02Z
Updated tests. Feature set complete
commit e80e7b1e59a0025a1e5ed266e06546e97b7c2770
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-04T09:23:23Z
Merge branch 'trunk' of github.com:apache/nutch into NUTCH-2058
commit 81368fe08193a365a6ca6f2179eb46e96ef0f7c5
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-04T09:34:18Z
README doc change
commit d2d534c1a9a48dd7a29147453f4c4e1fc78f11fb
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-04T10:17:27Z
Updated documentation
commit 0455d9119b694ccb9274a43dba392b76771a9da1
Author: PeterCiuffetti <[email protected]>
Date: 2015-07-04T10:23:19Z
Undoing build.xml change
----