There should be a proper flow-control policy that considers applications' needs.
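As one hypothetical shape such a flow-control policy could take (this is a sketch, not AsterixDB's actual design; the class name, threshold, and blocking strategy are all assumptions), the storage gate could block or reject inserts once the backlog of flushed-but-unmerged components crosses a threshold:

```python
import threading

class StorageGate:
    """Hypothetical admission gate keyed on the disk-component backlog."""

    def __init__(self, max_backlog):
        self.max_backlog = max_backlog
        self.backlog = 0          # flushed-but-not-yet-merged components
        self.cond = threading.Condition()

    def on_flush(self):
        # A flush adds one new disk component to the backlog.
        with self.cond:
            self.backlog += 1

    def on_merge_done(self, merged_count):
        # Merging k components produces 1, shrinking the backlog by k - 1.
        with self.cond:
            self.backlog -= merged_count - 1
            self.cond.notify_all()

    def admit(self, timeout=None):
        """Block an insert until the backlog drops below the threshold;
        return False (i.e., reject the insert) if it does not in time."""
        with self.cond:
            return self.cond.wait_for(
                lambda: self.backlog < self.max_backlog, timeout)
```

Under such a gate, Chen's question below ("can we reject the insertion requests?") becomes a policy knob: a `None` timeout means pure back-pressure (inserts slow down to merge speed), while a finite timeout turns a persistent backlog into explicit rejections.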
Best,
Young-Seok

On Sun, May 17, 2015 at 8:25 PM, Chen Li <[email protected]> wrote:
> A general question is: if the speed of incoming records is higher than
> the merge and disk-flush speed, what should we do? Can we reject the
> insertion requests?
>
> Chen
>
> On Sun, May 17, 2015 at 5:38 PM, Michael Carey <[email protected]> wrote:
> > Moving this to the incubator dev list (just looking it over again).
> > Q - when a merge finishes and there's a bigger backlog of components,
> > will it currently consider doing a more-ways merge? (Instead of 5, if
> > there are 13 sitting there when the 5-merge finishes, will a 13-merge
> > be initiated?) Just curious. We probably do need to think about some
> > sort of better flow control here - the storage gate should presumably
> > slow down admissions if it can't keep up - have to ponder what that
> > might mean. (I have a better idea of what it could mean for feeds than
> > for normal inserts.) One could argue that an increasing backlog is a
> > sign that we should be scaling out the number of partitions for the
> > dataset (future work but important work :-)).
> >
> > Cheers,
> > Mike
> >
> > On 4/15/15 2:33 PM, [email protected] wrote:
> >
> > Status: New
> > Owner: ----
> > Labels: Type-Defect Priority-Medium
> >
> > New issue 868 by [email protected]: About prefix merge policy behavior
> > https://code.google.com/p/asterixdb/issues/detail?id=868
> >
> > Below I describe how the current prefix merge policy works, based on
> > observations from ingestion experiments. A similar behavior was also
> > observed by Sattam. The observed behavior seems a bit unexpected, so I
> > post the observations here so that we can consider a better merge
> > policy and/or a better LSM index design regarding merge operations.
> >
> > The AQL statements used for the experiment are shown at the end of
> > this writing.
> >
> > The prefix merge policy decides to merge disk components based on the
> > following conditions:
> > 1. Look at the candidate components for merging in oldest-first order.
> > If any exist, identify the prefix of the sequence of all such
> > components for which the sum of their sizes exceeds
> > MaxMergableComponentSize, and schedule a merge of those components
> > into a new component.
> > 2. If no merge happens from 1, see if the number of candidate
> > components for merging exceeds MaxToleranceComponentCnt. If so,
> > schedule a merge of all of the current candidates into a new single
> > component.
> > In addition, the prefix merge policy doesn't allow concurrent merge
> > operations on a single index partition. In other words, if there is a
> > scheduled or an ongoing merge operation, no new merge is scheduled
> > even if the above conditions are met.
> >
> > Based on this merge policy, the following situation can occur. Suppose
> > MaxToleranceComponentCnt = 5 and 5 disk components have been flushed
> > to disk. When the 5th disk component is flushed, the prefix merge
> > policy schedules a merge operation to merge the 5 components. While
> > that merge operation is scheduled and running, concurrently ingested
> > records generate more disk components. As long as a merge operation is
> > not fast enough to keep up with the rate at which incoming records
> > generate 5 new disk components, the number of disk components
> > increases over time. So the slower merge operations are, the more disk
> > components there will be as time goes on.
> >
> > I also attached the result of the command "ls -alR <directory of the
> > asterixdb instance for an ingestion experiment>", which was executed
> > after the ingestion was over. The attached file shows that for the
> > primary index (whose directory is FsqCheckinTweet_idx_FsqCheckinTweet),
> > ingestion generated 20 disk components, where each disk component
> > consists of a btree (the filename has suffix _b) and a bloom filter
> > (the filename has suffix _f), and MaxMergableComponentSize is set to
> > 1GB.
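Replying inline: the two policy conditions quoted above can be sketched roughly as follows. This is an illustrative Python sketch of the decision logic exactly as described in the issue, not the actual AsterixDB code; the names `select_merge`, `max_mergable_size`, and `max_tolerance_count` are stand-ins for the real identifiers, and `>=` is used in condition 2 because the example above (5 components with MaxToleranceComponentCnt = 5) triggers a merge.

```python
def select_merge(component_sizes, max_mergable_size, max_tolerance_count):
    """component_sizes: sizes of merge-candidate disk components,
    oldest first. Returns the indices of the components to merge
    into one new component, or [] if no merge should be scheduled."""
    # Condition 1: find the shortest oldest-first prefix whose total
    # size exceeds max_mergable_size and merge that prefix.
    total = 0
    for i, size in enumerate(component_sizes):
        total += size
        if total > max_mergable_size:
            return list(range(i + 1))
    # Condition 2: otherwise, if the number of candidates reaches the
    # tolerance count, merge all current candidates into one component.
    if len(component_sizes) >= max_tolerance_count:
        return list(range(len(component_sizes)))
    return []
```

Note that this function is only consulted when no merge is scheduled or running on the partition, which is exactly why the backlog described above can grow unboundedly while one slow merge is in flight.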
> > It also shows that for the secondary index (whose directory is
> > FsqCheckinTweet_idx_sifCheckinCoordinate), ingestion generated more
> > than 1400 components, where each disk component consists of a
> > dictionary btree (suffix: _b), an inverted list (suffix: _i), a
> > deleted-key btree (suffix: _d), and a bloom filter for the deleted-key
> > btree (suffix: _f). Even after the ingestion is over, since our merge
> > operations happen asynchronously, merging continues and eventually
> > merges all mergable disk components according to the described merge
> > policy.
> >
> > ------------------------------------------
> > AQL for the ingestion experiment
> > ------------------------------------------
> > drop dataverse STBench if exists;
> > create dataverse STBench;
> > use dataverse STBench;
> >
> > create type FsqCheckinTweetType as closed {
> >     id: int64,
> >     user_id: int64,
> >     user_followers_count: int64,
> >     text: string,
> >     datetime: datetime,
> >     coordinates: point,
> >     url: string?
> > }
> >
> > create dataset FsqCheckinTweet (FsqCheckinTweetType) primary key id;
> >
> > /* This index type is only available in the kisskys/hilbertbtree
> >    branch. However, you can easily replace the sif index with an
> >    inverted keyword index on the text field and you will see similar
> >    behavior. */
> > create index sifCoordinate on FsqCheckinTweet(coordinates)
> >     type sif(-180.0, -90.0, 180.0, 90.0);
> >
> > /* create feed */
> > create feed TweetFeed
> > using file_feed
> > (("fs"="localfs"),
> >  ("path"="127.0.0.1:////Users/kisskys/Data/SynFsqCheckinTweet.adm"),
> >  ("format"="adm"),
> >  ("type-name"="FsqCheckinTweetType"),
> >  ("tuple-interval"="0"));
> >
> > /* connect feed */
> > use dataverse STBench;
> > set wait-for-completion-feed "true";
> > connect feed TweetFeed to dataset FsqCheckinTweet;
> >
> > Attachments:
> > storage-layout.txt 574 KB
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "asterixdb-dev" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to [email protected].
> > For more options, visit https://groups.google.com/d/optout.
