[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705675#action_12705675 ]
Jason Rutherglen commented on LUCENE-1313:
------------------------------------------

{quote}I don't like how "deep" the dichotomy of "RAMDir vs FSDir" {quote}

Agreed, it's a bit awkward but I don't see another way to do this. The good thing is if IW has written some .fdt files to the main dir (via FSD), IW crashes, then IW is created again, IFD automatically deletes the extraneous .fdt (and other extension) files.

{quote}Why can't we push FSD down to all these places (IFD, SegmentInfo/s, etc.)?{quote}

{quote}Could we simply make the single CMS instance smart enough to realize that a single RAM merge is allowed to proceed regardless of the thread limit?{quote}

Hmm... I think for benchmarking it would be good to allow options as we simply don't know. In the latest patch a ram mergescheduler can be set to the IndexWriter.

{quote}have to fix FSD to understand CFX must go to the dir too{quote}

I think this is fixed in the patch, where compound files are not created in RAM.

{quote} You're saying we should have IW create the ramdir by default after getReader is called and remove the IW ramdir constructor? Right. This should be "under the hood".{quote}

Ok, this will require some reworking of the patch.

{quote}OK, though I'd like to simply always use FSD, even if primary & secondary are the same dir. {quote}

How will always using FSD work? Doesn't it assume writing to two different directories?

{quote}this ram size should be used not only for deciding when it's time to merge to a disk segment, but also when it's time for DW to flush a new segment{quote}

In the new patch this is fixed.

{quote}So if budget is 32 MB, and net RAM used (segments + DW) is say 22, we have a 10 MB "budget", so we are allowed to select merges that total to < 10 MB.{quote}

One issue is the ram buffer flush doubles the ram used (because the segment is flushed as is to the RAM dir).
You're saying roughly estimate the ram size used on the result of a merge and have the merge policy take this into account? This makes sense, otherwise we will consistently (if temporarily) exceed the ram buffer size. The algorithm is fairly simple? Find segments whose total sizes are lower than whatever we have left of the max ram buffer size?

I have new code, but will rework it a bit to include this discussion.

> Realtime Search
> ---------------
>
> Key: LUCENE-1313
> URL: https://issues.apache.org/jira/browse/LUCENE-1313
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Index
> Affects Versions: 2.4.1
> Reporter: Jason Rutherglen
> Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch,
> LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch,
> LUCENE-1313.patch, LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch,
> lucene-1313.patch, lucene-1313.patch
>
>
> Realtime search with transactional semantics.
> Possible future directions:
> * Optimistic concurrency
> * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory
> enables replication. It is difficult to replicate using other methods
> because while the document may easily be serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and
> searching concurrently.
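
For reference, the budget-based merge selection discussed in the comment (pick RAM segments whose total size stays below what is left of the max RAM buffer) could look roughly like the sketch below. This is not code from the patch and it uses no Lucene classes: `RamMergeBudget`, `remainingBudget`, and `selectForMerge` are hypothetical names, the merged segment's size is assumed to be roughly the sum of its inputs, and picking the smallest segments first is just one plausible policy. It also does not model the temporary doubling of RAM during a buffer flush.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene or of the LUCENE-1313 patch.
final class RamMergeBudget {

    // Bytes left under the RAM budget after counting RAM segments and the DocumentsWriter buffer.
    static long remainingBudget(long maxRamBytes, long ramSegmentBytes, long docWriterBytes) {
        return Math.max(0L, maxRamBytes - ramSegmentBytes - docWriterBytes);
    }

    // Returns indices (into segmentSizes) of segments to merge, smallest first.
    // Assumption: the merged result needs about the sum of the input sizes,
    // so that sum must stay strictly below the remaining budget.
    static List<Integer> selectForMerge(long[] segmentSizes, long remaining) {
        Integer[] order = new Integer[segmentSizes.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Long.compare(segmentSizes[a], segmentSizes[b]));

        List<Integer> selected = new ArrayList<>();
        long total = 0L;
        for (int idx : order) {
            if (total + segmentSizes[idx] >= remaining) {
                break; // adding this segment would reach or exceed the budget
            }
            total += segmentSizes[idx];
            selected.add(idx);
        }
        if (selected.size() < 2) {
            selected.clear(); // a merge needs at least two segments
        }
        return selected;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 32 MB budget, 14 MB in RAM segments, 8 MB in the DW buffer: 10 MB left.
        long remaining = remainingBudget(32 * mb, 14 * mb, 8 * mb);
        long[] sizes = {8 * mb, 1 * mb, 3 * mb, 2 * mb};
        System.out.println(selectForMerge(sizes, remaining));
    }
}
```

With the numbers from the quoted example (32 MB budget, 22 MB in use), the sketch selects the 1, 2, and 3 MB segments (6 MB total) and leaves the 8 MB segment alone, since including it would push the estimated merge size past the 10 MB that remains.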