[ https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney updated NUTCH-650:
--------------------------------

    Attachment: hbase_v2.patch

New patch. It contains some fixes, plus the following:

- Added support for page modification detection in nutchbase (the previous
signature and fetch time are stored as distinct columns until we get support
for scanning multiple versions)
- Added a new PluggableHbase interface for nutchbase plugins
- Converted HtmlParseFilters for nutchbase
- Index cache policies in index-basichbase
- Added no-caching support to parse-htmlhbase
- Added support for content encoding auto-detection to nutchbase
- Do not instantiate a new MimeUtil for _every_ content
- Added support for (HTTP) headers
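
To illustrate the page modification detection item above, here is a minimal sketch of the idea, not the code in the patch. It assumes a row keeps the previous signature and fetch time next to the current ones and compares signatures on each fetch. The names ModificationCheck, PageRow, Status and record are hypothetical and do not come from nutchbase or HBase; a plain Java object stands in for the table row.

```java
import java.util.Arrays;

// Hypothetical sketch: all names here are illustrative, not from the patch.
class ModificationCheck {

    /** Stand-in for one page row; each field models a distinct column. */
    static final class PageRow {
        byte[] prevSignature; // signature from the previous fetch, null if none
        long prevFetchTime;   // previous fetch time in milliseconds
        byte[] signature;     // signature from the latest fetch
        long fetchTime;       // latest fetch time in milliseconds
    }

    enum Status { NEW, MODIFIED, NOT_MODIFIED }

    /**
     * Shifts the current signature and fetch time into the "previous"
     * columns, stores the new values, and reports whether the page changed.
     */
    static Status record(PageRow row, byte[] newSignature, long newFetchTime) {
        row.prevSignature = row.signature;
        row.prevFetchTime = row.fetchTime;
        row.signature = newSignature;
        row.fetchTime = newFetchTime;

        if (row.prevSignature == null) {
            return Status.NEW; // first fetch, nothing to compare against
        }
        return Arrays.equals(row.prevSignature, row.signature)
                ? Status.NOT_MODIFIED
                : Status.MODIFIED;
    }

    public static void main(String[] args) {
        PageRow row = new PageRow();
        System.out.println(record(row, new byte[] {1, 2}, 100L));
        System.out.println(record(row, new byte[] {1, 2}, 200L));
        System.out.println(record(row, new byte[] {3, 4}, 300L));
    }
}
```

Keeping the previous values in their own columns means the comparison needs only a single read of the row, which is presumably why it serves as a workaround until multiple versions can be scanned.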


> Hbase Integration
> -----------------
>
>                 Key: NUTCH-650
>                 URL: https://issues.apache.org/jira/browse/NUTCH-650
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 1.0.0
>            Reporter: Doğacan Güney
>            Assignee: Doğacan Güney
>         Attachments: hbase-integration_v1.patch, hbase_v2.patch
>
>
> This issue will track nutch/hbase integration

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
