Ah, that explains it, I guess. This block indexing of all records of a row should be an option; it has big costs for online indexing.

Let's take the case of Gmail itself. A user will have hundreds of thousands of e-mails, and every day 10-15 mails will be added to the corpus at different time intervals. Scattering records across segments, and taking a minor hit during search, would be the preferred choice, right? As compensation, we can use a SortingMergePolicy as documented at https://issues.apache.org/jira/browse/LUCENE-4752 to co-locate all records of a given row during merges across the participating segments. This will offset the performance loss to a good extent.
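To make the idea concrete, here is a rough, untested sketch of what I have in mind, written against plain Lucene 4.x rather than Blur's internals. SortingMergePolicy from LUCENE-4752 lives in the lucene-misc module there, and its exact constructor has changed between 4.x releases, so treat the wiring loosely; the "rowid" field name and the RAMDirectory are only placeholders for illustration.

// Sketch only: append records one at a time and let the merge policy
// re-sort segments by row id, so a row's records end up co-located again
// without re-reading and re-indexing the whole row on every small update.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.index.sorter.SortingMergePolicy;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;

public class RowColocationSketch {
  public static void main(String[] args) throws Exception {
    IndexWriterConfig conf =
        new IndexWriterConfig(Version.LUCENE_45, new StandardAnalyzer(Version.LUCENE_45));
    // Sort segments by "rowid" while merging, so records added in separate,
    // small commits drift back next to the rest of their row over time.
    Sort byRow = new Sort(new SortField("rowid", SortField.Type.STRING));
    conf.setMergePolicy(new SortingMergePolicy(new TieredMergePolicy(), byRow));

    IndexWriter writer = new IndexWriter(new RAMDirectory(), conf);
    try {
      // A new record for an existing row is just appended; nothing forces us
      // to fetch the hundreds of thousands of records already in the row.
      Document record = new Document();
      record.add(new StringField("rowid", "user-123", Store.YES));
      record.add(new SortedDocValuesField("rowid", new BytesRef("user-123")));
      record.add(new StringField("recordid", "mail-987", Store.YES));
      writer.addDocument(record);
      writer.commit();
    } finally {
      writer.close();
    }
  }
}

Between merges a row's records are scattered, which is where the minor search hit comes from, but each day's 10-15 new mails stay a cheap append instead of a full re-read and re-index of the row.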
What do you think?

-- Ravi

On Fri, Oct 11, 2013 at 6:12 AM, Aaron McCurry <[email protected]> wrote:

> On Thu, Oct 10, 2013 at 6:47 AM, Ravikumar Govindarajan <[email protected]> wrote:
>
> > I saw this JIRA on humungous rows and got quite confused on the UPDATE_ROW operation.
> >
> > https://issues.apache.org/jira/browse/BLUR-220
> >
> > Let's say I add 2 records to a row, whose existing records number in the hundreds of thousands.
> >
> > Will Blur attempt to first read all these records before adding the incoming 2 records?
>
> It has to right now.
>
> > What if we just expose simple record add/delete on a row, without fetching the row at all?
>
> The problem is that the internal query class is built to only support records (documents) that are indexed together as a single block, within a single segment. It is very performant for reads and searches, but as the row grows in size it becomes very costly.
>
> One idea I had was to detect when rows are hot (being updated a lot) or they are too large and move them into their own indexes. For the hot rows, once they cool off they could be merged back in with the regular rows in the main index.
>
> > It should be quite quick and highly useful, at least for apps already using lucene.
>
> Agreed, that's what that issue is meant to solve.
>
> > --
> > Ravi
> >
> > On Wed, Oct 9, 2013 at 11:27 AM, Ravikumar Govindarajan <[email protected]> wrote:
> >
> > > Yes, I think bringing in a mutable file in a lucene index brings its own set of problems to handle. Filters, Caches, Scoring, Snapshots/Commits etc. will all be affected.
> > >
> > > There is one JIRA on writing generations of updatable files, just like doc-deletes, instead of over-writing a single file [https://issues.apache.org/jira/browse/LUCENE-4258]. But that is still in-progress and, from what I understand, it could slow searches considerably.
> > >
> > > BTW, is it possible to extend BlurPartitioner and load it during start-up?
> > >
> > > Also, it would be awesome if Blur supports a per-row auto-complete feature.
> > >
> > > --
> > > Ravi
> > >
> > > On Sat, Oct 5, 2013 at 2:01 AM, Aaron McCurry <[email protected]> wrote:
> > >
> > >> I have thought of one possible problem with this approach. To date the mindset I have used in all of the Blur internals is that segments are immutable. This is a fundamental principle that Blur uses and I don't really have any ideas on where to begin checking for when this is a problem. I know filters are going to be an issue, not sure where else.
> > >>
> > >> Not saying that it can't be done, it's just not going to be as clean as I originally thought.
> > >>
> > >> Aaron
> > >>
> > >> On Fri, Oct 4, 2013 at 4:26 PM, Aaron McCurry <[email protected]> wrote:
> > >>
> > >> > On Fri, Oct 4, 2013 at 7:15 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > >> >
> > >> >> On a related note, do you think such an approach will fit in Blur:
> > >> >>
> > >> >> 1. Store the BDB file in the shard-server itself.
> > >> >
> > >> > Probably not, this would pin the BDB (or whatever the solution would be) to a specific server. We will have to sync to HDFS.
> > >> >
> > >> >> 2. Apply all incoming partial doc-updates to the local BDB file as well as an update-transaction log.
> > >> >
> > >> > Blur already has a write-ahead log as a part of its internals. It's written and synced to HDFS.
> > >> >
> > >> >> 3. Periodically sync dirty BDB files to HDFS and roll over the update-transaction log.
> > >> >>
> > >> >> Whenever a shard-server goes down, the take-over server can initially sync the BDB file from HDFS to local, replay the update-transaction log and then start serving data.
> > >> >
> > >> > Blur already does this internally; it records the mutates and replays them if a failure happens before a commit.
> > >> >
> > >> > Aaron
> > >> >
> > >> >> --
> > >> >> Ravi
> > >> >>
> > >> >> On Thu, Oct 3, 2013 at 11:14 PM, Ravikumar Govindarajan <[email protected]> wrote:
> > >> >>
> > >> >> > The mutate APIs are a good fit for individual column updates. A BlurCodec will be cool and solve a lot of problems.
> > >> >> >
> > >> >> > There are 3 caveats for such a codec:
> > >> >> >
> > >> >> > 1. Scores for affected queries will be wrong, until segment-merge.
> > >> >> >
> > >> >> > 2. Responsibility of ordering updates must be on the client.
> > >> >> >
> > >> >> > 3. Repeated updates for the same document can either take a generational approach [LUCENE-4258] or use a single version of storage [Redis/TC etc.], pushing the onus to the client, depending on how the Codec shapes up.
> > >> >> >
> > >> >> > The former will be semantically correct but really sluggish, while the latter will be faster during search.
> > >> >> >
> > >> >> > On Thu, Oct 3, 2013 at 8:53 PM, Aaron McCurry <[email protected]> wrote:
> > >> >> >
> > >> >> >> On Thu, Oct 3, 2013 at 11:08 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > >> >> >>
> > >> >> >> > Yeah, you are correct. A BDB file might probably never be ported to HDFS.
> > >> >> >> >
> > >> >> >> > Our daily update frequency comes to about 20% of the insertion rate.
> > >> >> >> >
> > >> >> >> > Let's say "UPDATE <TABLE> SET COL2=1 WHERE COL1=X".
> > >> >> >> >
> > >> >> >> > This update could potentially span across tens of thousands of SQL rows in our case, where COL2 is just a boolean flip.
> > >> >> >> >
> > >> >> >> > The problem is not with lucene's ability to handle load. Instead it is with the consistent load it puts on our content servers to read and re-tokenize such huge rows just for a boolean flip.
> > >> >> >> > Another big winner is that all our updatable fields are not involved in scoring at all. Just matching will do.
> > >> >> >> >
> > >> >> >> > The changes also sit in BDB only till the next segment merge, after which it is cleaned out. There is very little perf hit here for us, as users don't immediately search after a change.
> > >> >> >> >
> > >> >> >> > I am afraid there is no documentation/code/numbers on this currently in public, as it is still proprietary, but it is remarkably similar to the popular RedisCodec.
> > >> >> >> >
> > >> >> >> > "If you really need partial document updates, there would need to be changes throughout the entire stack"
> > >> >> >> >
> > >> >> >> > You mean, the entire stack of Blur? In case this is possible, can you give me a 10000-ft overview of what you have in mind?
> > >> >> >>
> > >> >> >> Interesting, now that I think about it. The situation that you describe is very interesting; I'm wondering, if we came up with something like this in Blur, whether it would fix our large Row issue. Or at the very least help the problem.
> > >> >> >>
> > >> >> >> https://issues.apache.org/jira/browse/BLUR-220
> > >> >> >>
> > >> >> >> Plus, the more I think about it, the mutate methods are probably the right implementation for modifying single columns. So the API of Blur probably wouldn't need to be changed. Maybe just the way it goes about dealing with changes. I'm thinking maybe we need our own BlurCodec to handle large Rows as well as Record (Document) updates.
> > >> >> >>
> > >> >> >> As an aside, I constantly am having to refer to Records as Documents; this is why I think we need a rename.
> > >> >> >>
> > >> >> >> Aaron
> > >> >> >>
> > >> >> >> > --
> > >> >> >> > Ravi
> > >> >> >> >
> > >> >> >> > On Thu, Oct 3, 2013 at 5:36 PM, Aaron McCurry <[email protected]> wrote:
> > >> >> >> >
> > >> >> >> > > The biggest issue with this is that the shards (the indexes) inside of Blur actually move from one server to another. So to support this behavior all the indexes are stored in HDFS. Due to the differences between HDFS and a normal POSIX file system, I highly doubt that the BDB file format in TokyoCabinet can ever be supported.
> > >> >> >> > >
> > >> >> >> > > If you really need partial document updates, there would need to be changes throughout the entire stack. I am curious why you need this feature? Do you have that many updates to the index? What is the update frequency? I'm just curious what kind of performance you get out of a setup like that?
> > >> >> >> > > Since I haven't ever run such a setup, I have no idea how to compare that kind of system to a base Lucene setup.
> > >> >> >> > >
> > >> >> >> > > Could you point me to some code or documentation? I would like to go and take a look.
> > >> >> >> > >
> > >> >> >> > > Thanks,
> > >> >> >> > > Aaron
> > >> >> >> > >
> > >> >> >> > > On Thu, Oct 3, 2013 at 7:00 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > >> >> >> > >
> > >> >> >> > > > One more help.
> > >> >> >> > > >
> > >> >> >> > > > We also maintain a file by the name "BDB", just like the "Sample" file for tracing used by Blur.
> > >> >> >> > > >
> > >> >> >> > > > This "BDB" file pertains to TokyoCabinet and is used purely for supporting partial updates to a document.
> > >> >> >> > > > All operations on this file rely on local file-paths only, through the use of native code.
> > >> >> >> > > > Currently, all update requests are local to the index files and it becomes trivial to support.
> > >> >> >> > > >
> > >> >> >> > > > Any pointers on how to take this forward in the Blur set-up of shard-servers & controllers?
> > >> >> >> > > >
> > >> >> >> > > > --
> > >> >> >> > > > Ravi
> > >> >> >> > > >
> > >> >> >> > > > On Tue, Oct 1, 2013 at 10:15 PM, Aaron McCurry <[email protected]> wrote:
> > >> >> >> > > >
> > >> >> >> > > > > You can control the fields to warm up via:
> > >> >> >> > > > >
> > >> >> >> > > > > http://incubator.apache.org/blur/docs/0.2.0/Blur.html#Struct_TableDescriptor
> > >> >> >> > > > >
> > >> >> >> > > > > The preCacheCols field. The comment is wrong however, so I will create a task to correct it. The use of the field is "family.column", just like you would search.
> > >> >> >> > > > >
> > >> >> >> > > > > Aaron
> > >> >> >> > > > >
> > >> >> >> > > > > On Tue, Oct 1, 2013 at 12:41 PM, Ravikumar Govindarajan <[email protected]> wrote:
> > >> >> >> > > > >
> > >> >> >> > > > > > Thanks Aaron
> > >> >> >> > > > > >
> > >> >> >> > > > > > General sampling and warming is fine and the code is really concise and clear.
> > >> >> >> > > > > >
> > >> >> >> > > > > > The act of reading brings the data into the block cache and the result is that the index is "hot".
> > >> >> >> > > > > >
> > >> >> >> > > > > > Will all the terms of a field be read and brought into the cache?
> > >> >> >> > > > > > If so, then it has an obvious implication: avoid fields like, say, attachment-data from warming up, provided queries don't often include such fields.
> > >> >> >> > > > > >
> > >> >> >> > > > > > On Tue, Oct 1, 2013 at 7:58 PM, Aaron McCurry <[email protected]> wrote:
> > >> >> >> > > > > >
> > >> >> >> > > > > > > Take a look at this package.
> > >> >> >> > > > > > >
> > >> >> >> > > > > > > https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=tree;f=blur-store/src/main/java/org/apache/blur/lucene/warmup;h=f4239b1947965dc7fe8218eaa16e3f39ecffdda0;hb=apache-blur-0.2
> > >> >> >> > > > > > >
> > >> >> >> > > > > > > Basically when the warmup process starts (which is asynchronous to the rest of the application) it flips a thread-local switch to allow for tracing of the file accesses. The sampler will sample each of the fields in each segment and create a sample file that attempts to detect the boundaries of each field within each file within each segment. Then it stores the sample info into the directory beside each segment (so that it doesn't have to re-sample the segment). After the sampling is complete or loaded, the warmup just reads the binary data from each file. The act of reading brings the data into the block cache and the result is that the index is "hot".
> > >> >> >> > > > > > >
> > >> >> >> > > > > > > Hope this helps.
> > >> >> >> > > > > > >
> > >> >> >> > > > > > > Aaron
> > >> >> >> > > > > > >
> > >> >> >> > > > > > > On Tue, Oct 1, 2013 at 10:09 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > >> >> >> > > > > > >
> > >> >> >> > > > > > > > As I understand, Lucene will store the files in the following way per-segment:
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > TIM file
> > >> >> >> > > > > > > > Field1 ---> Some byte[]
> > >> >> >> > > > > > > > Field2 ---> Some byte[]
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > TIP file
> > >> >> >> > > > > > > > Field1 ---> Some byte[]
> > >> >> >> > > > > > > > Field2 ---> Some byte[]
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > Blur will "sample" this lucene file in the following way:
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > Field1 --> <TIM, start-offset>, <TIP, start-offset>, ...
> > >> >> >> > > > > > > > Field2 --> <TIM, start-offset>, <TIP, start-offset>, ...
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > Is my understanding correct?
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > How does Blur warm up the fields when it does not know the "end-offset" or the "length" for each field to warm?
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > Will it by default read all Terms of a field?
> > >> >> >> > > > > > > >
> > >> >> >> > > > > > > > --
> > >> >> >> > > > > > > > Ravi
