Hi,

I have read the complete Storm documentation, but I do not understand the following things. Please help me.

1) Storm has the concept of stream grouping, but in practice I cannot see any difference between the groupings when I compare them one to one. According to the documentation, shuffle grouping means tuples are distributed equally across all tasks, i.e. if the spout emits only 10 tuples and my bolt has 5 tasks, then the bolt finally receives 50 tuples. Fields grouping means all tuples with the same field value go to the same task. I am unable to demonstrate this in practice with any of the groupings. Please help me understand stream grouping.

2) To process tuples in
the bolt faster, I used 10 executors with fields grouping. I ran into a problem here: if the spout emits 10 tuples, the bolt does not receive the same 10 tuples; duplicates arrive. When I use global grouping the results are fine, but it is slow in cluster mode.

3) What are the minimum basic properties needed for a topology to run fast?

4) I
wrote a spout in Java that reads a list of files. When I use multiple executors to read faster, records that were already processed come through again. I handled that scenario in plain Java; is it possible to handle it in Storm? I am unable to read and aggregate 10,000 records (across 10 files) successfully.

5) In a Trident topology, I ran aggregations over 10
files (1,000 records each) without problems, but with 10 files totalling 100,000 records (10,000 records each) my application keeps processing and nothing gets done. I mean that control never reaches the corresponding filter or function: emitting takes a very long time and execution never arrives at the filter or function. (Here I used
IRichSpout.)

Example code (main application):

    Config con = new Config();
    con.setDebug(true);
    con.put("fileLocation", args[0]);   // directory containing the input files
    con.put("ext", args[1]);            // file extension to read
    con.setNumAckers(10);
    file = args[2];                     // passed to MyFun1 below
    // con.setNumWorkers(Integer.parseInt(args[3]));
    System.out.println("application start time :" + new Date());

    TridentTopology topology = new TridentTopology();
    Stream s = topology.newStream("spout1", new ReadingSpout1(9080000))
                       .parallelismHint(10);
    // Group by "m", sum "v" into "r", then apply MyFun1 to each (m, r) pair.
    s.groupBy(new Fields("m"))
     .aggregate(new Fields("v"), new Sum(), new Fields("r"))
     .each(new Fields("m", "r"), new MyFun1(file), new Fields("o"))
     .parallelismHint(40);

    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("TD", con, topology.build());

6) Without Trident state,
are failed tuples never replayed?

7) When I run Trident with a specified number of workers, it does not run either. Have I missed any configuration?

8) I have a requirement to do streaming by reading a list of files from specified locations (a large number of files). I need to aggregate the data using sliding-window operations and write the result to files or some other destination. Which approach is better for this, a normal topology or Trident?

Please help me!
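
To make question 1 easier to check without a cluster, here is a minimal plain-Java sketch (no Storm dependency) of the routing the documentation describes. It is only a simplified model, not Storm's actual implementation: shuffle grouping is approximated as round-robin, and fields grouping as the hash of the field value modulo the number of tasks. The class name GroupingSketch and its methods are hypothetical names chosen for this illustration. Under this model every emitted tuple is delivered to exactly one task, so 10 emitted tuples give 10 received tuples in total across the 5 tasks.

```java
import java.util.Arrays;
import java.util.List;

public class GroupingSketch {

    // Shuffle grouping, modelled here as round-robin (an assumption):
    // each tuple is delivered to exactly one task, and the per-task
    // counts come out as even as possible.
    public static int[] shuffleCounts(List<String> tuples, int numTasks) {
        int[] counts = new int[numTasks];
        for (int i = 0; i < tuples.size(); i++) {
            counts[i % numTasks]++;
        }
        return counts;
    }

    // Fields grouping, modelled here as hash(field value) mod numTasks
    // (an assumption): equal field values always map to the same task.
    public static int fieldsTask(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        // Ten tuples, identified by the value of the grouping field.
        List<String> tuples =
            Arrays.asList("a", "b", "a", "c", "b", "a", "d", "e", "a", "b");
        int numTasks = 5;

        int[] counts = shuffleCounts(tuples, numTasks);
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        System.out.println("shuffle: per-task counts = " + Arrays.toString(counts)
            + ", total received = " + total);

        for (String t : tuples) {
            System.out.println("fields: value " + t + " -> task "
                + fieldsTask(t, numTasks));
        }
    }
}
```

In a real topology the same check could be made by logging the task id together with the grouping field inside the bolt and comparing which values each task sees under each grouping.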