You might be misreading the results for the Mac mini. If you compare
the Mac mini's sync'd time with the Mac Pro's, it is about what you would
expect; the surprising part is that on the mini the unsync'd time is
roughly the same as the sync'd one.
It might be that Apple configures the driver to disallow lazy writes
for the internal drive, perhaps for reliability. Or it might be that
the internal drive is severely fragmented, so being able to coalesce
blocks doesn't help much.
I have a Mac mini as well, and find writes to the external FireWire
drive much faster.
Xbench shows, for my Mac mini internal drive:
Results            29.37
  System Info
    Xbench Version     1.3
    System Version     10.4.10 (8R2232)
    Physical RAM       2048 MB
    Model              Macmini1,1
    Drive Type         ST98823AS
  Disk Test          29.37
    Sequential         43.69
      Uncached Write     42.43    26.05 MB/sec [4K blocks]
      Uncached Write     43.26    24.48 MB/sec [256K blocks]
      Uncached Read      50.58    14.80 MB/sec [4K blocks]
      Uncached Read      39.83    20.02 MB/sec [256K blocks]
    Random             22.12
      Uncached Write     7.52     0.80 MB/sec [4K blocks]
      Uncached Write     50.36    16.12 MB/sec [256K blocks]
      Uncached Read      67.14    0.48 MB/sec [4K blocks]
      Uncached Read      76.26    14.15 MB/sec [256K blocks]
For the external FireWire drive:
Results            44.36
  System Info
    Xbench Version     1.3
    System Version     10.4.10 (8R2232)
    Physical RAM       2048 MB
    Model              Macmini1,1
    Drive Type         ST3500630A
  Disk Test          44.36
    Sequential         53.50
      Uncached Write     47.01    28.86 MB/sec [4K blocks]
      Uncached Write     56.23    31.82 MB/sec [256K blocks]
      Uncached Read      44.11    12.91 MB/sec [4K blocks]
      Uncached Read      76.72    38.56 MB/sec [256K blocks]
    Random             37.89
      Uncached Write     13.94    1.48 MB/sec [4K blocks]
      Uncached Write     70.45    22.55 MB/sec [256K blocks]
      Uncached Read      92.09    0.65 MB/sec [4K blocks]
      Uncached Read      113.54   21.07 MB/sec [256K blocks]
On Nov 13, 2007, at 3:54 PM, Michael McCandless (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless reopened LUCENE-1044:
----------------------------------------
OK, I ran sync/nosync tests across various platforms/IO systems. In
each case I ran the test once with doSync=true and once with
doSync=false, using this alg:
analyzer=org.apache.lucene.analysis.SimpleAnalyzer
doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
docs.file=/lucene/wikifull.txt
doc.maker.forever=false
ram.flush.mb = 8
max.buffered = 0
directory = FSDirectory
max.field.length = 2147483647
doc.term.vector=false
doc.stored=false
work.dir = /tmp/lucene
fsdirectory.dosync = false
ResetSystemErase
CreateIndex
{AddDoc >: 150000
CloseIndex
RepSumByName
I.e., the time to index the first 150K docs from Wikipedia.
Results for single hard drive:
Mac mini (10.5 Leopard) single 4200 RPM "notebook" (2.5") drive
-- 2.3% slower:
sync - 296.80 sec
nosync - 290.06 sec
Mac pro (10.4 Tiger), single external drive -- 35.5% slower:
sync - 259.61 sec
nosync - 191.53 sec
Win XP Pro laptop, single drive -- 38.2% slower
sync - 536.00 sec
nosync - 387.90 sec
Linux (2.6.22.1), ext3 single drive -- 23% slower
sync - 185.42 sec
nosync - 150.56 sec
Results for multiple hard drives (RAID arrays):
Linux (2.6.22.1), reiserfs 6 drive RAID5 array -- 49% slower (!!)
sync - 239.32 sec
nosync - 160.56 sec
Mac Pro (10.4 Tiger), 4 drive RAID0 array -- 1% faster
sync - 157.26 sec
nosync - 158.93 sec
So at this point I'm torn...
The simplest approach (sync() before close()) is very costly in many
cases (not just laptop IO subsystems); the reiserfs test was rather
shocking. Yet it's oddly very low cost in other cases: the Mac mini
result I find amazing.
It's frustrating to lose such performance "out of the box" for the
presumably extremely rare event of OS/machine crash/power cut.
Maybe we should leave the default as false for now?
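(For reference, the "simplest approach" above amounts to forcing each
written index file down to stable storage before its close() returns.
Very roughly, and leaving out the buffering the real FSDirectory
output streams do, the pattern is the sketch below; the class name is
made up for illustration.)

import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch of a file output that optionally fsyncs before close.
class SyncingFileOutput {
  private final RandomAccessFile file;
  private final boolean doSync;

  SyncingFileOutput(String path, boolean doSync) throws IOException {
    this.file = new RandomAccessFile(path, "rw");
    this.doSync = doSync;
  }

  void writeBytes(byte[] b, int offset, int length) throws IOException {
    file.write(b, offset, length);
  }

  void close() throws IOException {
    try {
      if (doSync) {
        // Ask the OS to push this file's buffered data to the device
        // before we consider it written (the fsync that costs time above).
        file.getFD().sync();
      }
    } finally {
      file.close();
    }
  }
}

The whole sync-vs-nosync difference in the numbers above comes down to
that one sync() call: skip it and the OS is free to write lazily and
coalesce blocks; make it and close() waits until the data has been
handed to the disk.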
Behavior on hard power shutdown
-------------------------------
Key: LUCENE-1044
URL: https://issues.apache.org/jira/browse/LUCENE-1044
Project: Lucene - Java
Issue Type: Bug
Components: Index
Environment: Windows Server 2003, Standard Edition, Sun
Hotspot Java 1.5
Reporter: venkat rangan
Assignee: Michael McCandless
Fix For: 2.3
Attachments: LUCENE-1044.patch, LUCENE-1044.take2.patch,
LUCENE-1044.take3.patch
When indexing a large number of documents, upon a hard power
failure (e.g., pulling the power cord), the index seems to get
corrupted. We start a Java application as a Windows Service and
feed it documents. In some cases (after an index size of 1.7GB,
with 30-40 index segment .cfs files), the following is observed.
The 'segments' file contains only zeros. Its size is 265 bytes -
all bytes are zeros.
The 'deleted' file also contains only zeros. Its size is 85 bytes
- all bytes are zeros.
Before corruption, the segments file and deleted file appear to be
correct. After this corruption, the index is corrupted and lost.
This is a problem observed in Lucene 1.4.3. We are not able to
upgrade our customer deployments to 1.9 or a later version, but
would be happy to back-port a patch if it is small enough and if
this problem is already solved.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.