Hi,

I have read the complete Storm documentation, but I do not understand the
following things. Please help me!

1) We have the concept of stream grouping, but in practice I do not see any
difference between one grouping and another. According to the documentation,
shuffle grouping means tuples are distributed equally across all tasks, i.e.
if the spout emits only 10 tuples and my bolt has 5 tasks, then the bolt
finally receives 50 tuples. Fields grouping means all tuples with the same
field values go to the same task. I am unable to verify this in practice for
any of the groupings. Please help me with stream grouping.

2) To process tuples in the
bolt faster, I used 10 executors with fields grouping. I ran into a problem
here: if the spout emits 10 tuples, I do not receive the same tuples in the
bolt; duplicates arrive instead. When I use global grouping it works fine, but
it is slow in cluster mode.

3) What are the minimum basic properties needed for a topology to run fast?

4) I
read a list of files from a spout written in Java. When I use multiple
executors to read faster, records that were already processed come through
again. I handled that scenario in plain Java; is it possible in Storm? I am
unable to successfully perform the aggregations (reading from the files and
processing) for 10000 records across 10 files.

5) In a Trident topology I did aggregations with 10
files (1000 records each) and they worked fine, but when I use 10 files with
100000 records (10000 records each), my application keeps processing and
nothing gets done. I mean control never reaches the corresponding filter or
function; it takes a lot of time to emit and never comes to the filter or
function. (Here I used
IRichSpout.)

Example code (main app code):

    Config con = new Config();
    con.setDebug(true);
    con.put("fileLocation", args[0]);
    con.put("ext", args[1]);
    con.setNumAckers(10);
    file = args[2];
    //con.setNumWorkers(Integer.parseInt(args[3]));
    System.out.println("application start time :" + new Date());

    TridentTopology topology = new TridentTopology();
    Stream s = topology.newStream("spout1", new ReadingSpout1(9080000))
                       .parallelismHint(10);
    s.groupBy(new Fields("m"))
     .aggregate(new Fields("v"), new Sum(), new Fields("r"))
     .each(new Fields("m", "r"), new MyFun1(file), new Fields("o"))
     .parallelismHint(40);

    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("TD", con, topology.build());

6) Without Trident state,
are failed tuples never replayed?

7) When I run Trident with a specified number of workers, it also does not
run. Please help me: is there any configuration I missed?

8) I have a requirement to perform streaming by reading a list of files from
specified locations (a large number of files). I need to aggregate them using
sliding window operations and write the result to some files or any other
destination. Which way is better, a normal topology or Trident?

Please help me!
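
For reference on question 1, here is a minimal plain-Java sketch (no Storm dependency) of how the Storm documentation appears to describe the two groupings. The class GroupingSketch and all of its methods are hypothetical names made up for illustration. Real shuffle grouping picks tasks randomly rather than in strict rotation, and the hash-modulo rule used for fields grouping is only an assumption about the mechanism, not Storm's actual code. Under this reading each tuple is delivered to exactly one task, so 10 emitted tuples over 5 tasks would total 10 received tuples, not 50.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone simulation; it does not use any Storm classes.
class GroupingSketch {

    // One empty "inbox" per bolt task.
    static List<List<String>> emptyTasks(int numTasks) {
        List<List<String>> tasks = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            tasks.add(new ArrayList<>());
        }
        return tasks;
    }

    // Shuffle-style grouping (simplified to rotation): every tuple goes to
    // exactly ONE task, and the load is spread evenly over the tasks.
    static List<List<String>> shuffleGrouping(List<String> tuples, int numTasks) {
        List<List<String>> tasks = emptyTasks(numTasks);
        for (int i = 0; i < tuples.size(); i++) {
            tasks.get(i % numTasks).add(tuples.get(i));
        }
        return tasks;
    }

    // Fields-style grouping (assumed hash-modulo rule): the task is derived
    // from the grouping value, so equal values always reach the same task.
    static List<List<String>> fieldsGrouping(List<String> tuples, int numTasks) {
        List<List<String>> tasks = emptyTasks(numTasks);
        for (String value : tuples) {
            tasks.get(Math.floorMod(value.hashCode(), numTasks)).add(value);
        }
        return tasks;
    }

    // Total number of tuples received across all tasks.
    static int totalReceived(List<List<String>> tasks) {
        int total = 0;
        for (List<String> task : tasks) {
            total += task.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // 10 tuples, identified here only by their grouping value.
        List<String> tuples =
            Arrays.asList("a", "b", "a", "c", "b", "a", "d", "c", "a", "b");

        List<List<String>> shuffled = shuffleGrouping(tuples, 5);
        List<List<String>> byField = fieldsGrouping(tuples, 5);

        System.out.println("shuffle per task: " + shuffled);
        System.out.println("shuffle total:    " + totalReceived(shuffled));
        System.out.println("fields per task:  " + byField);
        System.out.println("fields total:     " + totalReceived(byField));
    }
}
```

Printing the per-task lists like this (or logging the task id inside the bolt's execute method in a real topology) is one way the difference between the groupings could be checked in practice.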
