Hi community. I'm using Nutch 1.9 and Solr 4.10. I use Nutch to parse zip documents, but the language field is empty in Solr for all of these documents, and this is a problem for me. The ParseZip plugin uses Tika to detect the MIME type and to extract the content of the files, but the language is missing. I was thinking that if the package contains 3 documents, the language could be a multivalued field holding the languages of all the documents inside. What do you think about this?
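To make the idea concrete, here is a rough, standalone sketch of what I mean. It is not Nutch or Tika code: `ZipLanguages`, `detectLanguages`, `toyDetector` and `sampleZip` are names I made up for illustration, the detector is a toy stand-in for a real language identifier, and it assumes every entry is plain UTF-8 text. It only uses `java.util.zip` to walk the entries and collect the distinct languages, which is the set of values the multivalued field would hold.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipLanguages {

    /** Collects the distinct language codes of all file entries in a zip stream. */
    public static Set<String> detectLanguages(InputStream zipData,
                                              Function<String, String> detector)
            throws IOException {
        // LinkedHashSet keeps first-seen order and drops duplicate languages.
        Set<String> languages = new LinkedHashSet<>();
        try (ZipInputStream zip = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Read the whole entry into memory.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                // Assumption: the entry is plain UTF-8 text. A real plugin would
                // run a content extractor on the entry first.
                String text = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                String lang = detector.apply(text);
                if (lang != null && !lang.isEmpty()) {
                    languages.add(lang);
                }
            }
        }
        return languages;
    }

    /** Toy stand-in for a real language identifier (hypothetical, demo only). */
    public static String toyDetector(String text) {
        String t = " " + text.toLowerCase(Locale.ROOT) + " ";
        if (t.contains(" the ")) {
            return "en";
        }
        if (t.contains(" el ") || t.contains(" la ")) {
            return "es";
        }
        return null; // unknown language: contributes no value
    }

    /** Builds a small in-memory zip with one text entry per argument. */
    public static byte[] sampleZip(String... texts) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (int i = 0; i < texts.length; i++) {
                zip.putNextEntry(new ZipEntry("doc" + i + ".txt"));
                zip.write(texts[i].getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip("the cat sat", "el gato duerme", "the dog ran");
        Set<String> langs =
                detectLanguages(new ByteArrayInputStream(zip), ZipLanguages::toyDetector);
        System.out.println(langs); // prints [en, es]
    }
}
```

With a package of 3 documents, two in English and one in Spanish, the field would hold two values. On the Solr side, the language field would need to be declared with `multiValued="true"` in the schema for this to index.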
- about language extraction for zip documents Eyeris RodrIguez Rueda
- Re: about language extraction for zip docume... Lewis John Mcgibbney
- Re: about language extraction for zip do... Mattmann, Chris A (3980)

