We are using Nutch 2.3 with HBase 0.94.27 to extract data into Elasticsearch 1.7. Everything works fine for text, and the HTML is stored in HBase, but we cannot extract it. We have tried dump and a few other options, but nothing has worked so far. Does anyone have any ideas?
- Getting Whole HTML? Trevor Oakley
- RE: Getting Whole HTML? Markus Jelsma

