You might be misreading the results for the Mac mini. If you compare
the Mac mini's sync'd time with the Mac Pro's, it is about what you would
expect; the surprising part is that on the mini the unsync'd time is
roughly the same as the sync'd one.
It might be that Apple configures the driver to disallow lazy writes
for the internal drive, perhaps for reliability. Or it might be that
the internal drive is severely fragmented, so being able to coalesce
blocks doesn't help much.
I have a Mac mini as well, and find writes to the external FireWire
drive much faster.
Xbench shows, for my Mac mini internal drive:
Results            29.37
  System Info
    Xbench Version     1.3
    System Version     10.4.10 (8R2232)
    Physical RAM       2048 MB
    Model              Macmini1,1
    Drive Type         ST98823AS
  Disk Test          29.37
    Sequential         43.69
      Uncached Write     42.43    26.05 MB/sec [4K blocks]
      Uncached Write     43.26    24.48 MB/sec [256K blocks]
      Uncached Read      50.58    14.80 MB/sec [4K blocks]
      Uncached Read      39.83    20.02 MB/sec [256K blocks]
    Random             22.12
      Uncached Write     7.52     0.80 MB/sec [4K blocks]
      Uncached Write     50.36    16.12 MB/sec [256K blocks]
      Uncached Read      67.14    0.48 MB/sec [4K blocks]
      Uncached Read      76.26    14.15 MB/sec [256K blocks]
For the external FireWire drive:
Results            44.36
  System Info
    Xbench Version     1.3
    System Version     10.4.10 (8R2232)
    Physical RAM       2048 MB
    Model              Macmini1,1
    Drive Type         ST3500630A
  Disk Test          44.36
    Sequential         53.50
      Uncached Write     47.01    28.86 MB/sec [4K blocks]
      Uncached Write     56.23    31.82 MB/sec [256K blocks]
      Uncached Read      44.11    12.91 MB/sec [4K blocks]
      Uncached Read      76.72    38.56 MB/sec [256K blocks]
    Random             37.89
      Uncached Write     13.94    1.48 MB/sec [4K blocks]
      Uncached Write     70.45    22.55 MB/sec [256K blocks]
      Uncached Read      92.09    0.65 MB/sec [4K blocks]
      Uncached Read      113.54   21.07 MB/sec [256K blocks]
On Nov 13, 2007, at 3:54 PM, Michael McCandless (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless reopened LUCENE-1044:
----------------------------------------
OK, I ran sync/nosync tests across various platforms/IO systems. In
each case I ran the test once with doSync=true and once with
doSync=false, using this alg:
analyzer=org.apache.lucene.analysis.SimpleAnalyzer
doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
docs.file=/lucene/wikifull.txt
doc.maker.forever=false
ram.flush.mb = 8
max.buffered = 0
directory = FSDirectory
max.field.length = 2147483647
doc.term.vector=false
doc.stored=false
work.dir = /tmp/lucene
fsdirectory.dosync = false
ResetSystemErase
CreateIndex
{AddDoc >: 150000
CloseIndex
RepSumByName
I.e., the time to index the first 150K docs from Wikipedia.
Results for single hard drive:
Mac mini (10.5 Leopard) single 4200 RPM "notebook" (2.5") drive
-- 2.3% slower:
sync - 296.80 sec
nosync - 290.06 sec
Mac pro (10.4 Tiger), single external drive -- 35.5% slower:
sync - 259.61 sec
nosync - 191.53 sec
Win XP Pro laptop, single drive -- 38.2% slower
sync - 536.00 sec
nosync - 387.90 sec
Linux (2.6.22.1), ext3 single drive -- 23% slower
sync - 185.42 sec
nosync - 150.56 sec
Results for multiple hard drives (RAID arrays):
Linux (2.6.22.1), reiserfs 6 drive RAID5 array -- 49% slower (!!)
sync - 239.32 sec
nosync - 160.56 sec
Mac Pro (10.4 Tiger), 4 drive RAID0 array -- 1% faster
sync - 157.26 sec
nosync - 158.93 sec
So at this point I'm torn...
The simplest approach (sync() before close()) is very costly in many
cases (not just laptop IO subsystems); the reiserfs test was rather
shocking. Yet it's oddly very low cost in other cases: the Mac mini
result I find amazing.
It's frustrating to lose such performance "out of the box" for the
presumably extremely rare event of OS/machine crash/power cut.
Maybe we should leave the default as false for now?
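(For reference, the "simplest approach" above amounts to forcing each
written index file down to stable storage before its close() returns.
Very roughly, and leaving out the buffering the real FSDirectory
output streams do, the pattern is the sketch below; the class name is
made up for illustration.)

import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch of a file output that optionally fsyncs before close.
class SyncingFileOutput {
  private final RandomAccessFile file;
  private final boolean doSync;

  SyncingFileOutput(String path, boolean doSync) throws IOException {
    this.file = new RandomAccessFile(path, "rw");
    this.doSync = doSync;
  }

  void writeBytes(byte[] b, int offset, int length) throws IOException {
    file.write(b, offset, length);
  }

  void close() throws IOException {
    try {
      if (doSync) {
        // Ask the OS to push this file's buffered data to the device
        // before we consider it written (the fsync that costs time above).
        file.getFD().sync();
      }
    } finally {
      file.close();
    }
  }
}

The whole sync-vs-nosync difference in the numbers above comes down to
that one sync() call: skip it and the OS is free to write lazily and
coalesce blocks; make it and close() waits until the data has been
handed to the disk.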
Behavior on hard power shutdown
-------------------------------
Key: LUCENE-1044
URL: https://issues.apache.org/jira/browse/LUCENE-1044
Project: Lucene - Java
Issue Type: Bug
Components: Index
Environment: Windows Server 2003, Standard Edition, Sun
Hotspot Java 1.5
Reporter: venkat rangan
Assignee: Michael McCandless
Fix For: 2.3
Attachments: LUCENE-1044.patch, LUCENE-1044.take2.patch,
LUCENE-1044.take3.patch
When indexing a large number of documents, upon a hard power
failure (e.g., pulling the power cord), the index seems to get
corrupted. We start a Java application as a Windows Service and
feed it documents. In some cases (after an index size of 1.7GB,
with 30-40 index segment .cfs files), the following is observed.
The 'segments' file contains only zeros. Its size is 265 bytes -
all bytes are zeros.
The 'deleted' file also contains only zeros. Its size is 85 bytes
- all bytes are zeros.
Before corruption, the segments file and deleted file appear to be
correct. After this corruption, the index is corrupted and lost.
This is a problem observed in Lucene 1.4.3. We are not able to
upgrade our customer deployments to 1.9 or a later version, but
would be happy to back-port a patch if it is small enough and if
this problem is already solved.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.