[ 
https://issues.apache.org/jira/browse/MAHOUT-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012621#comment-13012621
 ] 

Dmitriy Lyubimov commented on MAHOUT-633:
-----------------------------------------

Yes. It is often the case, especially with linear algebra side-information 
files, that you need to load those blocks in a specific order. In this case the 
order is determined by the task id. That's what my code is doing here.

It's just that if you don't support some strategy for determining file order, 
then I can't use your multifile glob support. 

--
Thought No. 2: I also thought it may be worth factoring the problem into two 
parts: (1) a sequence file iterator and (2) a multifile glob iterator 
(supporting ordering as well) that delegates iteration to the single-file 
iterator. (That's usually how I have done that sort of thing.) 
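
To make the two-part split concrete, here is a minimal sketch, not the code in 
the patch. The class name OrderedMultiFileIterator, the opener function, and 
the taskIdOf helper are all hypothetical, and file paths are plain strings so 
the example runs without Hadoop. In real code the opener would return the 
single-file sequence file iterator, and the path list would come from the glob 
match.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical sketch: iterates several files in a caller-defined order. */
public class OrderedMultiFileIterator<T> implements Iterator<T> {

  private final Iterator<String> paths;                // remaining files, already sorted
  private final Function<String, Iterator<T>> opener;  // stands in for the single-file iterator
  private Iterator<T> current;                         // iterator over the file being read

  public OrderedMultiFileIterator(List<String> matchedPaths,
                                  Comparator<String> order,
                                  Function<String, Iterator<T>> opener) {
    // Copy before sorting so the caller's list is left untouched.
    List<String> sorted = new ArrayList<>(matchedPaths);
    sorted.sort(order);
    this.paths = sorted.iterator();
    this.opener = opener;
  }

  @Override
  public boolean hasNext() {
    // Advance to the next file until one has data; empty files are skipped.
    while ((current == null || !current.hasNext()) && paths.hasNext()) {
      current = opener.apply(paths.next());
    }
    return current != null && current.hasNext();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return current.next();
  }

  /** Assumption: the file name ends in "-" plus a numeric task id, e.g. "part-00003". */
  public static int taskIdOf(String path) {
    return Integer.parseInt(path.substring(path.lastIndexOf('-') + 1));
  }
}
```

Ordering by task id is then just a matter of passing 
Comparator.comparingInt(OrderedMultiFileIterator::taskIdOf) as the order 
argument; any other strategy plugs in the same way, and the single-file 
iterator never needs to know about it.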

> Add SequenceFileIterable; put Iterable stuff in one place
> ---------------------------------------------------------
>
>                 Key: MAHOUT-633
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-633
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering, Collaborative Filtering
>    Affects Versions: 0.4
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>            Priority: Minor
>              Labels: iterable, iterator, sequence-file
>             Fix For: 0.5
>
>         Attachments: MAHOUT-633.patch, MAHOUT-633.patch, MAHOUT-633.patch
>
>
> In another project I have a useful little class, SequenceFileIterable, which 
> simplifies iterating over a sequence file. It's like FileLineIterable. I'd 
> like to add it, then use it throughout the code. See patch, which for now 
> merely has the proposed new classes. 
> Well it also moves some other iterator-related classes that seemed to be 
> outside their rightful home in common.iterator.

