Ah sorry, I misunderstood. Buffering in memory and writing to HDFS will be faster. By writing to disk first, you reduce the probability of losing that data, at the cost of making it a bit slower.
However, if you are running two receivers, the probability that you will lose data is lower anyway. So I guess buffering in memory and writing to HDFS would be OK.

--Srinath

On Fri, Nov 7, 2014 at 8:24 AM, Anjana Fernando <[email protected]> wrote:

> Hi Srinath,
>
> I think that example is a bit flawed :) .. I didn't mean to compare
> Cassandra with the HDFS case here. I know Cassandra is far more complicated
> than the HDFS case, where the data operations are very simple. I have a
> feeling that, with events that small, it may have turned into a CPU-bound
> operation rather than an I/O-bound one, because of the processing required
> for each event (maybe their batch implementation is poor); that may be why
> even the bigger batches are also slow.
>
> As for the OS-level buffers you mentioned: yes, they efficiently batch the
> physical disk writes in memory and flush them out later. But that is a
> different thing. Here, we are just writing to the disk and reading it back
> again, so as I see it, we are just using the local disk as a buffer, where
> we could do the same thing in RAM. Basically, build up sizable chunks in
> memory and write them to HDFS. That way we avoid the (comparatively small)
> overhead of writing to and reading from the local disk, while the
> bottleneck remains writing the data out over the network to a remote
> server's disk somewhere. Simply put, this direct HDFS operation should be
> able to saturate the network link we have; and even if it can't, we can ask
> ourselves how writing to the local disk and reading it back again could
> possibly make it faster.
>
> Cheers,
> Anjana.
>
> On Thu, Nov 6, 2014 at 6:15 PM, Srinath Perera <[email protected]> wrote:
>
>> Of course we need to try it out and verify; I am just making the case
>> that we should try it out :)
>>
>> Also, RDBMS should be the default, as most scenarios can be handled with
>> DBs, and there is no reason to make everyone's life complicated.
>>
>> --Srinath
>>
>> On Fri, Nov 7, 2014 at 7:44 AM, Srinath Perera <[email protected]> wrote:
>>
>>> 1) Anjana, you are assuming that bandwidth is the bottleneck. Let me
>>> give an example.
>>>
>>> With sequential reads and writes, an HDD can do >100 MB/sec and a 1G
>>> network can do >50 MB/sec. But the best number we have seen from BAM is
>>> about 40k events/sec (and that was with 4 machines or so; let's assume one
>>> machine). Let's assume 20-byte events. Then it is doing <1 MB/sec.
>>>
>>> The problem is that Cassandra breaks the data into lots of small
>>> operations, losing the OS-level buffer-to-buffer transfers that file
>>> transfers can do. I have tried increasing the batch size for Cassandra,
>>> which helps with smaller batches, but after about a few thousand
>>> operations in the same batch, things start to get much slower.
>>>
>>> The best numbers will come when we run two receivers instead of NFS.
>>>
>>> 2) Frank, this is analytics data. So it is read-only, and in most cases
>>> we need only time-based queries at coarse resolution (a 15-minute
>>> smallest resolution is fine for most cases), i.e. "run this batch query
>>> on the last hour of data" and so on.
>>>
>>> However, we have some scenarios where we do ad-hoc queries for things
>>> like activity monitoring. The above would not work for those, and we
>>> would have to run a batch job to push that data to an RDBMS or Solr etc.
>>> Anjana, we need to discuss this.
>>>
>>> But there are also a lot of use cases where we need to receive and write
>>> the events to disk as soon as possible and later run MapReduce on top of
>>> them. For those, the above will work.
>>>
>>> --Srinath
>>>
>>> On Fri, Nov 7, 2014 at 7:23 AM, Anjana Fernando <[email protected]> wrote:
>>>
>>>> Hi Sanjiva,
>>>>
>>>> On Thu, Nov 6, 2014 at 4:01 PM, Sanjiva Weerawarana <[email protected]>
>>>> wrote:
>>>>
>>>>> Anjana I think the idea was for the file system -> HDFS upload to
>>>>> happen via a simple cron job type thing.
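For reference, Srinath's back-of-the-envelope numbers above work out as follows. This is just a quick sketch of the arithmetic; the 40k events/sec and 20-byte figures are the ones quoted in the thread, and the HDD and network figures are the rough sequential-I/O capacities he cites:

```java
public class ThroughputEstimate {
    public static void main(String[] args) {
        long eventsPerSec = 40_000;   // best observed BAM rate (from the thread)
        long bytesPerEvent = 20;      // assumed small-event size
        double mbPerSec = (eventsPerSec * bytesPerEvent) / (1024.0 * 1024.0);

        double hddMbPerSec = 100.0;   // sequential HDD throughput quoted above
        double netMbPerSec = 50.0;    // usable 1G network throughput quoted above

        System.out.printf("event throughput:  %.2f MB/sec%n", mbPerSec);
        System.out.printf("HDD utilization:   %.1f%%%n", 100 * mbPerSec / hddMbPerSec);
        System.out.printf("link utilization:  %.1f%%%n", 100 * mbPerSec / netMbPerSec);
    }
}
```

So at 40k small events/sec the record-at-a-time path is using well under 2% of either the disk or the link, which is the point being made: the bottleneck is per-operation overhead, not bandwidth.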
>>>>>
>>>>
>>>> Even so, we would just be moving the problem to another area; the
>>>> overall work done by the hardware is still the same (writing to disk,
>>>> reading it back, writing it to the network). That is, even though we can
>>>> get to a very high throughput initially by writing to the local disk
>>>> first, later on we have to read it back and write it to HDFS over the
>>>> network, which is the slower part of the operation. So if we keep
>>>> loading the machine at an extreme throughput, we will eventually run out
>>>> of space on that disk.
>>>>
>>>> Cheers,
>>>> Anjana.
>>>>
>>>>> On Wed, Nov 5, 2014 at 9:19 AM, Anjana Fernando <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Srinath,
>>>>>>
>>>>>> Wouldn't it be better if we just make the batch size bigger? That is,
>>>>>> let's have a sizable local in-memory store, probably close to 64MB,
>>>>>> which is the default HDFS block size, and only flush the buffer once
>>>>>> it is filled, or perhaps when the receiver is idle. I was just
>>>>>> thinking that writing to the file system first is itself expensive,
>>>>>> since there are the additional steps of writing all the records to the
>>>>>> local file system, reading them back again, and then finally writing
>>>>>> them to HDFS. And of course, a network file system would again be an
>>>>>> overhead, not to mention the implementation/configuration
>>>>>> complications that would come with it. IMHO, we should keep these
>>>>>> scenarios as simple as possible.
>>>>>>
>>>>>> I'm doing our new BAM data layer implementations here [1], where I'm
>>>>>> almost done with an RDBMS implementation and doing some refactoring
>>>>>> now (mail on this yet to come :)). I can also do an HDFS one after
>>>>>> that and check it.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/wso2/carbon-analytics/tree/master/components/xanalytics
>>>>>>
>>>>>> Cheers,
>>>>>> Anjana.
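The in-memory buffering Anjana describes above could be sketched roughly like this. This is a minimal illustration, not a committed design: `ChunkSink` is a hypothetical interface standing in for the real HDFS output stream, and the 64 MB threshold is the default HDFS block size mentioned in the thread:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

/** Hypothetical sink; in practice this would wrap an HDFS output stream. */
interface ChunkSink {
    void write(byte[] chunk) throws IOException;
}

/** Accumulates events in RAM and flushes once a block-sized chunk builds up. */
class BufferedEventWriter {
    private final ChunkSink sink;
    private final int flushThreshold;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    BufferedEventWriter(ChunkSink sink, int flushThreshold) {
        this.sink = sink;
        this.flushThreshold = flushThreshold; // e.g. 64 * 1024 * 1024 (HDFS block size)
    }

    synchronized void append(byte[] event) throws IOException {
        buffer.write(event, 0, event.length);
        if (buffer.size() >= flushThreshold) {
            flush();
        }
    }

    /** Would also be called periodically when the receiver is idle, per the proposal. */
    synchronized void flush() throws IOException {
        if (buffer.size() > 0) {
            sink.write(buffer.toByteArray());
            buffer.reset();
        }
    }
}
```

With a block-sized threshold, each flush hands HDFS one large sequential write instead of many small record writes, which is exactly the buffer-to-buffer behavior the thread argues for; the trade-off, as discussed above, is that up to one buffer's worth of data is lost if the receiver dies before a flush.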
>>>>>>
>>>>>> On Tue, Nov 4, 2014 at 6:56 PM, Srinath Perera <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> The following came out of a chat with Sanjiva on a scenario
>>>>>>> involving a very large number of events coming into BAM.
>>>>>>>
>>>>>>> Currently we use Cassandra to store the events; the numbers we got
>>>>>>> out of it have not been great, and Cassandra needs too much attention
>>>>>>> to get even those numbers.
>>>>>>>
>>>>>>> With Cassandra (or any DB) we write data as records. We can batch
>>>>>>> them, but the amount of data in one I/O operation is still small. In
>>>>>>> comparison, file transfers are much, much faster, and they are the
>>>>>>> fastest way to get data from A to B.
>>>>>>>
>>>>>>> So I am proposing to write the events that come in to a local file
>>>>>>> in the Data Receiver, and periodically append them to an HDFS file.
>>>>>>> We can arrange the data in folders by stream and files by timestamp
>>>>>>> (e.g. 1h of data goes to a new file), so we can selectively pull and
>>>>>>> process data using Hive. (We can use something like
>>>>>>> https://github.com/OpenHFT/Chronicle-Queue to write data to disk.)
>>>>>>>
>>>>>>> If the user needs to avoid losing any messages at all in the case of
>>>>>>> a disk failure, he can either have a SAN or NFS, or run two replicas
>>>>>>> of the receivers (we should write some code so that only one of the
>>>>>>> receivers actually puts data to HDFS).
>>>>>>>
>>>>>>> Coding-wise, this should not be too hard. I am sure this will be a
>>>>>>> factor faster than Cassandra (of course, we need to do a PoC and
>>>>>>> verify).
>>>>>>>
>>>>>>> WDYT?
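The folder-by-stream, file-per-hour layout proposed above could look roughly like this. This is only a sketch; the base path, the `.dat` suffix, and the hour-granularity naming are illustrative assumptions, not anything decided in the thread:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Maps an event's stream and timestamp to an HDFS-style path, rolling files hourly. */
class EventPathResolver {
    private final String basePath;

    EventPathResolver(String basePath) {
        this.basePath = basePath;
    }

    String resolve(String streamName, long timestampMillis) {
        SimpleDateFormat hourFmt = new SimpleDateFormat("yyyy-MM-dd-HH");
        hourFmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        // e.g. /bam/events/org.wso2.sample.stream/2014-11-07-08.dat
        return basePath + "/" + streamName + "/"
                + hourFmt.format(new Date(timestampMillis)) + ".dat";
    }
}
```

Because each hour maps to its own file, a Hive query over "the last hour of data" only needs to read one file per stream rather than scanning everything, which is what makes the selective time-based pull cheap.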
>>>>>>>
>>>>>>> --Srinath
>>>>>>>
>>>>>>> --
>>>>>>> ============================
>>>>>>> Blog: http://srinathsview.blogspot.com twitter: @srinath_perera
>>>>>>> Site: http://people.apache.org/~hemapani/
>>>>>>> Photos: http://www.flickr.com/photos/hemapani/
>>>>>>> Phone: 0772360902
>>>>>>
>>>>>> --
>>>>>> *Anjana Fernando*
>>>>>> Senior Technical Lead
>>>>>> WSO2 Inc. | http://wso2.com
>>>>>> lean . enterprise . middleware
>>>>>
>>>>> --
>>>>> Sanjiva Weerawarana, Ph.D.
>>>>> Founder, Chairman & CEO; WSO2, Inc.; http://wso2.com/
>>>>> email: [email protected]; office: (+1 650 745 4499 | +94 11 214 5345)
>>>>> x5700; cell: +94 77 787 6880 | +1 408 466 5099; voip: +1 650 265 8311
>>>>> blog: http://sanjiva.weerawarana.org/; twitter: @sanjiva
>>>>> Lean . Enterprise . Middleware
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
