GitHub user prernasatija opened a pull request:
https://github.com/apache/nutch/pull/57
2.x
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/nutch 2.x
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nutch/pull/57.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #57
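The patch and closing steps above can be sketched in shell. This is a minimal sketch, not an official procedure: the helper names `closing_message` and `apply_pr_patch` are hypothetical, and it assumes `curl` and `git` are installed and that the commands are run inside a clone with a clean working tree.

```shell
#!/bin/sh
# Sketch only: the helper names are illustrative and are not part of
# Nutch, GitHub, or ASF tooling.

PR_URL="https://github.com/apache/nutch/pull/57"

# Print the line that the notification asks for in the commit message.
closing_message() {
  printf 'This closes #%s\n' "$1"
}

# Download the pull request as a mailbox-format patch and apply it to the
# current branch. Assumes a clean working tree in the target clone.
apply_pr_patch() {
  curl -fsSL "$1.patch" | git am
}

# Example, to be run manually inside a clone:
#   apply_pr_patch "$PR_URL"
#   git commit --allow-empty -m "$(closing_message 57)"
```

The functions only define the steps; nothing touches the network or the repository until they are called explicitly.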
----
commit f7ef04dca1b763e86502a3b23064520ded39181e
Author: Ferdy Galema <[email protected]>
Date: 2012-08-31T12:49:26Z
NUTCH-1462 Elasticsearch not indexing when type==null in NutchDocument
metadata
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1379431
13f79535-47bb-0310-9956-ffa450edef68
commit 1bb03c759180688f58284189abca787437935647
Author: Ferdy Galema <[email protected]>
Date: 2012-08-31T12:56:41Z
NUTCH-1463 Elasticsearch indexer should wait and check response for last
flush
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1379435
13f79535-47bb-0310-9956-ffa450edef68
commit c5e2236f36a881ee7fec97aff3baf9bb32b40200
Author: Ferdy Galema <[email protected]>
Date: 2012-08-31T13:02:32Z
NUTCH-1448 Redirected urls should be handled more cleanly (more like an
outlink url)
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1379438
13f79535-47bb-0310-9956-ffa450edef68
commit 33de245d3211d2be19559870c5a821381e18e9c0
Author: Ferdy Galema <[email protected]>
Date: 2012-08-31T15:57:18Z
NUTCH-1431 Introduce link 'distance' and add configurable max distance in
the generator
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1379488
13f79535-47bb-0310-9956-ffa450edef68
commit c1b68c35ee02d1588786d5767f3feaa71b5393e1
Author: Ferdy Galema <[email protected]>
Date: 2012-09-07T08:17:58Z
NUTCH-1459 Remove dead code (phase2) from InjectorJob
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1381931
13f79535-47bb-0310-9956-ffa450edef68
commit e878515c26e1bceaed2555a3cac2402322f27046
Author: Ferdy Galema <[email protected]>
Date: 2012-09-07T14:19:47Z
NUTCH-1456 Updater not setting batchId in markers correctly. (Alexander
Kingson via ferdy)
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1382037
13f79535-47bb-0310-9956-ffa450edef68
commit 32b825c58bcb1647bec548cb1ea17ee4ae522399
Author: Lewis John McGibbney <[email protected]>
Date: 2012-09-15T16:16:48Z
NUTCH-1162 Write JUnit tests for parse-js
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1385103
13f79535-47bb-0310-9956-ffa450edef68
commit 4369dac176a228d0c9ef729dca89bcff0e097211
Author: Lewis John McGibbney <[email protected]>
Date: 2012-09-15T23:06:34Z
NUTCH-1470 Ensure test files are included for runtime testing
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1385199
13f79535-47bb-0310-9956-ffa450edef68
commit ecb86f4de0209c73e5b00fa0df8d4c6f58c592bf
Author: Ferdy Galema <[email protected]>
Date: 2012-09-17T09:24:33Z
NUTCH-1468 Redirects that are external links not adhering to
db.ignore.external.links
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1386526
13f79535-47bb-0310-9956-ffa450edef68
commit 068636631cc73786b150e1ec2cd0be38919890e7
Author: Lewis John McGibbney <[email protected]>
Date: 2012-09-18T14:07:57Z
NUTCH-1162 test file
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1387173
13f79535-47bb-0310-9956-ffa450edef68
commit 19e694e609776a388ce1409a3272a2a15b101222
Author: Lewis John McGibbney <[email protected]>
Date: 2012-09-18T14:13:26Z
add keyspace reference to NullPointerException on inject before
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1387175
13f79535-47bb-0310-9956-ffa450edef68
commit 590ad02aea95c1dcb9c6ad25de1e38a815c7fa82
Author: Lewis John McGibbney <[email protected]>
Date: 2012-09-18T20:30:25Z
NUTCH-1432 property storage.schema does not work anymore, should be
storage.schema.webpage and storage.schema.host
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1387347
13f79535-47bb-0310-9956-ffa450edef68
commit fceecfabb9c47952f0ec2b3fcd2a6241dbedb465
Author: Sebastian Nagel <[email protected]>
Date: 2012-09-18T20:52:08Z
NUTCH-1415 release packages to contain top level folder apache-nutch-x.x
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1387356
13f79535-47bb-0310-9956-ffa450edef68
commit 2da30f3d398a53da6fcc85f143e8b2d0b1c75837
Author: Lewis John McGibbney <[email protected]>
Date: 2012-09-21T14:37:07Z
revert gora-cassandra to v0.2, prepare for 2.2 development
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1388529
13f79535-47bb-0310-9956-ffa450edef68
commit bc7ef2e9c62606c5f134d5e1ad8ea001d90dbd36
Author: Sebastian Nagel <[email protected]>
Date: 2012-10-10T21:05:19Z
NUTCH-706 Url regex normalizer: pattern for session id removal not to match
"newsId"
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1396795
13f79535-47bb-0310-9956-ffa450edef68
commit 2e31b117aa7e25193bcdeabce4088f71c91a7029
Author: Sebastian Nagel <[email protected]>
Date: 2012-10-10T21:15:55Z
NUTCH-1344 BasicURLNormalizer to normalize https same as http
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1396800
13f79535-47bb-0310-9956-ffa450edef68
commit 8b35d734a5112af93f571aab218e190a225990dd
Author: Sebastian Nagel <[email protected]>
Date: 2012-10-10T21:58:06Z
NUTCH-706 (applied correct patch)
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1396822
13f79535-47bb-0310-9956-ffa450edef68
commit f9d0e7685d7f43cc8f1bbbd37d73fe2d9ddc4461
Author: Lewis John McGibbney <[email protected]>
Date: 2012-10-10T23:02:57Z
NUTCH-874 Make sure all plugins in src/plugin are compatible with Nutch 2.0
and Gora (part 1)
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1396850
13f79535-47bb-0310-9956-ffa450edef68
commit 33e7ae5a7ed524939e91f887de7c9821deb8a866
Author: Julien Nioche <[email protected]>
Date: 2012-10-20T08:49:53Z
NUTCH-1087 crawl script
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1400390
13f79535-47bb-0310-9956-ffa450edef68
commit 39893c6e5681e6936572f5d9983ab1decd085bf5
Author: Julien Nioche <[email protected]>
Date: 2012-10-20T09:14:40Z
NUTCH-1433 Upgrade to Tika 1.2
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1400397
13f79535-47bb-0310-9956-ffa450edef68
commit 244ebf6682c3ea5969a2f36ab72e0fa2fceead31
Author: Sebastian Nagel <[email protected]>
Date: 2012-10-23T20:47:16Z
NUTCH-1344 BasicURLNormalizer to normalize https same as http - forgot to
add committer
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1401458
13f79535-47bb-0310-9956-ffa450edef68
commit 0cffa912513dcdd6526ae4189f7207f23c903b49
Author: Sebastian Nagel <[email protected]>
Date: 2012-10-23T20:52:21Z
NUTCH-1421 RegexURLNormalizer to only skip rules with invalid patterns
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1401460
13f79535-47bb-0310-9956-ffa450edef68
commit a722e43d2c5a6225d46b2178174def4918a6b4d4
Author: Markus Jelsma <[email protected]>
Date: 2012-11-06T09:17:38Z
NUTCH-1491 Strip UTF-8 non-character codepoints in title
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1406077
13f79535-47bb-0310-9956-ffa450edef68
commit c7342c74b52a0fc2ee6c070299f997f673584013
Author: Lewis John McGibbney <[email protected]>
Date: 2012-11-07T18:47:54Z
NUTCH-1493 Error adding field 'contentLength'='' during solrindex using
index-more
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1406749
13f79535-47bb-0310-9956-ffa450edef68
commit e9b46e9088e48c45a4086b983117ebaf3e202e30
Author: Lewis John McGibbney <[email protected]>
Date: 2012-11-09T16:35:50Z
* NUTCH-1488 bin/nutch to run junit from any directory (snagel via lewismc)
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1407531
13f79535-47bb-0310-9956-ffa450edef68
commit f35d6ab520701be0fd345be5b577eba73ecee9e4
Author: Lewis John McGibbney <[email protected]>
Date: 2012-11-12T12:53:27Z
NUTCH-1496 ParserJob logs skipped urls with level info
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1408271
13f79535-47bb-0310-9956-ffa450edef68
commit 37c31a62c488ef0d9b248f1be8e930db29ba38ed
Author: Lewis John McGibbney <[email protected]>
Date: 2012-11-12T13:56:30Z
NUTCH-1451 Upgrade automaton jar to 1.11-8
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1408289
13f79535-47bb-0310-9956-ffa450edef68
commit 0d350bc0f6e9468b7560de443230425550099550
Author: Sebastian Nagel <[email protected]>
Date: 2012-11-12T21:20:55Z
NUTCH-1484 TableUtil unreverseURL fails on file:// URLs
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1408465
13f79535-47bb-0310-9956-ffa450edef68
commit 1873f6eb3e8c2c5d6b5a55dff1304397c66dcbe9
Author: Lewis John McGibbney <[email protected]>
Date: 2012-11-22T14:45:07Z
NUTCH-1370 Expose exact number of urls injected @runtime
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1412566
13f79535-47bb-0310-9956-ffa450edef68
commit 3a1effa22216236e8989aed39a4b7bc3cb0b1f9c
Author: Lewis John McGibbney <[email protected]>
Date: 2012-11-22T14:51:28Z
NUTCH-1370 Expose exact number of urls injected @runtime
git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1412570
13f79535-47bb-0310-9956-ffa450edef68
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and would like it, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---