[
https://issues.apache.org/jira/browse/MAHOUT-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060250#comment-14060250
]
Suneel Marthi commented on MAHOUT-1319:
---------------------------------------
Wenjun, the difference in seqdirectory between Mahout 0.7 and 0.9 is that in
0.7 seqdirectory ran in sequential mode only, while in 0.9 it runs in
MR mode by default and expects to read its input files from HDFS (which
doesn't seem to exist in your case). If you are trying to run 0.9 seqdirectory,
try the following:
mahout seqdirectory -i 20news-bydate-train/ -o 20news-bydate-train-seq -ow -xm sequential
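
As a rough sketch of the same idea, the command can be wrapped in a small shell helper that first checks that the input directory exists on the local disk, then prints the sequential-mode command line. The function name seqdirectory_cmd is hypothetical and not part of Mahout. The helper only prints the command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper, not part of Mahout: prints the seqdirectory command
# for a local input directory, forcing sequential execution with -xm.
seqdirectory_cmd() {
  input_dir=$1
  output_dir=$2

  # Sequential mode reads from the local file system, so fail early
  # if the input directory is not present there.
  if [ ! -d "$input_dir" ]; then
    echo "input directory not found on local disk: $input_dir" >&2
    return 1
  fi

  # Same flags as in the command above: -ow overwrites existing output,
  # -xm sequential bypasses the default MR mode.
  printf 'mahout seqdirectory -i %s -o %s -ow -xm sequential\n' \
    "$input_dir" "$output_dir"
}
```

Calling seqdirectory_cmd 20news-bydate-train/ 20news-bydate-train-seq from the directory that contains the data should print the command line given above, which can then be reviewed and run by hand.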
> seqdirectory -filter argument silently ignored when run as MR
> -------------------------------------------------------------
>
> Key: MAHOUT-1319
> URL: https://issues.apache.org/jira/browse/MAHOUT-1319
> Project: Mahout
> Issue Type: Bug
> Components: Integration
> Affects Versions: 0.8
> Reporter: Liz Merkhofer
> Assignee: Suneel Marthi
> Labels: seqdirectory, text
> Fix For: 0.9
>
> Attachments: MAHOUT-1319-custom-filter.patch, MAHOUT-1319.patch
>
>
> When "seqdirectory" (Sequence Files from Input Directory) is run from the command
> line with a custom filter specified via the -filter parameter, the argument
> is ignored and the default "PrefixAdditionFilter" is applied to the input. No
> exception is thrown.
> When the same command is run with "-xm sequential", the filter is found and
> works as expected.