[ 
https://issues.apache.org/jira/browse/SAMOA-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182116#comment-15182116
 ] 

ASF GitHub Bot commented on SAMOA-58:
-------------------------------------

Github user gdfm commented on the pull request:

    https://github.com/apache/incubator-samoa/pull/48#issuecomment-192876600
  
    Thanks for the patch, @edi-bice!
    There are a few changes that go beyond the scope of the patch, and their goal is not entirely clear to me.
    For the rest, apart from some small doubts, the patch looks good.
    Please have a look at my comments, and see if they make sense.
    I'd like to get the patch in soon.


> Samoa AvroFileStream from HDFSFileStreamSource stops at end of first file
> -------------------------------------------------------------------------
>
>                 Key: SAMOA-58
>                 URL: https://issues.apache.org/jira/browse/SAMOA-58
>             Project: SAMOA
>          Issue Type: Bug
>          Components: SAMOA-Instances
>         Environment: RHEL 6.6, java 1.8.0_72
>            Reporter: Edi Bice
>
> It appears Samoa is capable of streaming a collection of files as a single 
> stream, effectively concatenating the files. However, when using Samoa 
> AvroFileStream with HDFSFileStreamSource, the stream seems to stop at the 
> end of the first file:
> bin/samoa local target/SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar 
> "PrequentialEvaluation -i -1 -l (classifiers.ensemble.Bagging -s 100) -s 
> (AvroFileStream -s HDFSFileStreamSource -f 
> /tmp/order_and_feats_flat_avro/2016_02_18/ -c 1 -e binary) -f 10000"
> 2016-02-18 20:43:20,991 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:183) 
> - last event is received!
> 2016-02-18 20:43:20,991 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:184) 
> - total count: 262144
> ...
> 2016-02-18 20:43:20,993 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:191) 
> - total evaluation time: 34 seconds for 262144 instances
> bash-4.1$ hadoop fs -ls /tmp/order_and_feats_flat_avro/2016_02_18 | more
> Found 70 items
> -rw-r--r--   3 yarn hdfs  230855335 2016-02-18 16:01 
> /tmp/order_and_feats_flat_avro/2016_02_18/hdfs-1a238673-c4ec-4462-be67-78d573efa790-00001
> -rw-r--r--   3 yarn hdfs  229800273 2016-02-18 16:04 
> /tmp/order_and_feats_flat_avro/2016_02_18/hdfs-1a238673-c4ec-4462-be67-78d573efa790-00002
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
