---------- Forwarded message ----------
From: shen jimmy <[email protected]>
Date: 2015-06-07 0:19 GMT+08:00
Subject: Re: InputFormat on samoa-storm
To: Gianmarco De Francisci Morales <[email protected]>

Hi gdfm,

Thanks for your help.

For my first question, I mean: what if a text file doesn't begin with the symbols @attribute, @data, etc.? Do I need to add them to my text file, assuming the data in my text file looks like *1,1,1,2,3,..,2*? (Another question just occurred to me: how can I know which column is the label: the first one, the last one, or some other?)

Here comes my second question. As you can see below, this is the configuration file of my Storm cluster, *storm.yaml*:

[image]

and the file *samoa-storm.properties*:

[image]

As we know, covtypeNorm.arff has 580,000+ lines of data. The first time I ran the topology, it seemed to be going well, as below:

[image]

Then the error occurred, and the spout's worker seemed to restart:

[image]

That troubled me a lot! TT

Besides, I ran the topology with this command:

bin/samoa storm target/<my compiled samoa jar for storm> "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (ArffFileStream -f /home/covtypeNorm.arff) -f 100000 -d /tmp/dump.csv"

What's more, there is indeed a dump.csv file on the node centos3, but it is empty.

Those are all my questions. Thanks, and sorry again for taking your time.

Best wishes

2015-06-05 20:36 GMT+08:00 Gianmarco De Francisci Morales <[email protected]>:
> Hi Shen,
>
> Arff files are text files (fancy csv with a header).
> What do you mean with "process text files"? You need a way to convert a
> text file into structured data, arff is one way to do that.
>
> Which version of Storm are you using?
> We have run experiments on Storm clusters, and they usually work, however
> there might be configuration issues.
>
> Cheers,
> --
> Gianmarco
>
> On 3 June 2015 at 16:21, Gianmarco De Francisci Morales <[email protected]>
> wrote:
>
>> Forwarding to the SAMOA mailing list.
>> --
>> Gianmarco
>>
>> On 3 June 2015 at 05:55, shen jimmy <[email protected]> wrote:
>>
>>> Hi gdfm,
>>>
>>> i'm new to samoa, i got 2 questions and i thought i need your help very
>>> much.
>>>
>>> The first one is, what i can find online only .arff files were tested on
>>> samoa-storm. My question is, how to process text files with samoa-storm? Or
>>> should i convert text files into .arff files?
>>>
>>> The second is that i tested learner bagging and arffstreamgenerator for
>>> covtypeNorm.arff as what u did online, when i set mode as local in the
>>> samoa-storm.properties, it worked well and i could see a dump.csv file
>>> since i added argument "-d <path to dump file>/dump.csv". But when i set
>>> mode as cluster and submit it to my storm cluster(3 virtual machines nodes,
>>> centos6.5), even though i can monitored the topology on storm ui, the it
>>> didn't work well. Instead, it seemed running and reset and running again,
>>> sometimes i can see the file dump.csv but only few log information in it.
>>>
>>> Could u help me or give me some advice, i'll be appreciated. thanks
>>>
>>
>>
>
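
For reference, the conversion discussed in this thread (plain comma-separated rows such as 1,1,1,2,3,..,2 turned into an ARFF file with an @attribute/@data header) could be sketched roughly as below. This is only a minimal Python sketch and not part of SAMOA: the function name csv_to_arff, the attribute names att0, att1, ..., the relation name, and the choice of the last column as the label are all assumptions made for illustration. It also assumes every non-label column is numeric and that the label is nominal.

```python
import csv
import io


def csv_to_arff(csv_text, relation="mydata", label_index=-1):
    """Build ARFF text from header-less, comma-separated rows.

    Hypothetical helper: assumes numeric features and a nominal label
    located at label_index (default: last column).
    """
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    n_cols = len(rows[0])
    label_col = label_index % n_cols  # turn -1 into the last column index

    # Collect the distinct label values in first-seen order.
    labels = []
    for row in rows:
        value = row[label_col].strip()
        if value not in labels:
            labels.append(value)

    lines = [f"@relation {relation}", ""]
    for i in range(n_cols):
        if i == label_col:
            # Nominal attribute: the allowed values are listed in braces.
            lines.append("@attribute class {" + ",".join(labels) + "}")
        else:
            lines.append(f"@attribute att{i} numeric")

    lines += ["", "@data"]
    lines += [",".join(cell.strip() for cell in row) for row in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(csv_to_arff("1,1,2\n3,4,1\n"))
```

If the label is in another column, pass its zero-based position as label_index (for example label_index=0 for the first column). Which column the stream reader actually treats as the class should be checked against the reader's own options.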
