[ https://issues.apache.org/jira/browse/MAHOUT-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter resolved MAHOUT-1243.
----------------------------------------

    Resolution: Fixed

Added new option "seqDictOut" that triggers writing of the dictionary in SequenceFileFormat

> Dictionary file format in Lucene-Mahout integration is not in SequenceFileFormat
> --------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1243
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1243
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.7
>            Reporter: Suneel Marthi
>            Assignee: Suneel Marthi
>             Fix For: 0.8
>
>
> Dictionary file format generated from lucene.vectors is not in SequenceFileFormat and hence not acceptable as input to CVB clustering.
> The problem code from Driver.java
> {Code}
>     File dictOutFile = new File(dictOut);
>     log.info("Dictionary Output file: {}", dictOutFile);
>     Writer writer = Files.newWriter(dictOutFile, Charsets.UTF_8);
>     DelimitedTermInfoWriter tiWriter = new DelimitedTermInfoWriter(writer, delimiter, field);
>     try {
>       tiWriter.write(termInfo);
>     } finally {
>       Closeables.close(tiWriter, false);
>     }
> {Code}
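
For readers comparing the two dictionary formats, here is a minimal sketch of the data a SequenceFile dictionary would carry: one (term, index) pair per entry instead of delimited text. It uses only the JDK and does not call Mahout or Hadoop. The class `DictionaryPairs`, its `parse` method, and the assumed line layout `term<delimiter>docFreq<delimiter>index` are hypothetical, since the issue does not show what DelimitedTermInfoWriter actually emits.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Mahout.
class DictionaryPairs {

  // Assumed line layout (not confirmed by the issue):
  //   term<delimiter>docFreq<delimiter>index
  static Map<String, Integer> parse(List<String> lines, String delimiter) {
    // LinkedHashMap keeps the terms in the order they appear in the file.
    Map<String, Integer> termToIndex = new LinkedHashMap<>();
    for (String line : lines) {
      // Skip blank lines and lines starting with '#', treated here as headers.
      if (line.isEmpty() || line.startsWith("#")) {
        continue;
      }
      // Quote the delimiter so it is matched literally, not as a regex.
      String[] fields = line.split(Pattern.quote(delimiter));
      if (fields.length < 3) {
        continue; // not a term entry, e.g. a line holding only a count
      }
      termToIndex.put(fields[0], Integer.parseInt(fields[2].trim()));
    }
    return termToIndex;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList(
        "2",
        "#term\tdoc freq\tidx",
        "apache\t12\t0",
        "mahout\t7\t1");
    System.out.println(parse(lines, "\t"));
  }
}
```

Each resulting pair corresponds to one key/value record that would be appended to the SequenceFile. The concrete Writable types used by the "seqDictOut" option are not stated in this message, so they are left out of the sketch.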