RE: Error messages.

2003-03-06 Thread berthiaume_wayne
Originally, I was running -W0 with fsync(2) being used to ensure data
integrity. I'm presently testing lk 2.4.19 + Namesys patches 1 thru 13 +
Chris Mason's write barrier patch with hdparm -W1 and fsync(2). Under this
configuration I don't see the problem you are encountering, but I am
investigating data corruption on the ReiserFS partitions.

-Original Message-
From: Anders Widman [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 06, 2003 9:20 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Error messages.


> Anders, here is what I have and it works on thousands of duplicate
> servers:

> Tyan S2420 with 1.0GHz PIII
> 512MB RAM
> Promise PDC20269 in PCI1

Using PDC20268

> Intel Dual 10/100 NIC in PCI2
> Four Maxtor 250GB IDE drives off of the Promise controller
> lk 2.4.19 on RH7.3

> hdparm -a64 -K1 -W1 -u1 -m16 -c1 -d1 /dev/hd

hm.. The big difference I see is that I normally use -c3.




RE: Error messages.

2003-03-06 Thread berthiaume_wayne
Cable length is similar to mine. The PDC20268 will only go to UDMA
5. I haven't done any testing with this controller; I needed the PDC20269's
UDMA 6 capability.

-Original Message-
From: Anders Widman [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 06, 2003 8:52 AM
To: [EMAIL PROTECTED]
Subject: Re: Error messages.


> That's rather puzzling... I did not have the same problems with the
> mii driver; however, I was unable to run the full extent of the 250GB
> drive or the UDMA level 6 with mii under 2.4.13, so I was using a
> special patched driver from Promise to support both the pdc20269 and
> 48LBA. In 2.4.19 the 48 LBA support was added, so I was able to get the
> full address range on the 250GB drives without patches from Promise;
> however, I was still unable to run UDMA level 6 on the onboard Intel
> chip.

UDMA6 works on the machine with the VIA KT400 chip and the 2.4.21 kernel.
The other machines are limited to ATA-100, as the controllers do not
support anything higher. Actually, I do not need high DMA; DMA-33 should
be enough.

The errors come even with DMA turned off, though. It seems, at least so
far, that the system crashes/lockups come much more often with DMA than
without.

> I still use the Promise pdc20269 and run UDMA level 6 on thousands
> of deployed servers at this time. What is the cable length from drives
> to controller? Even though you have several configured servers, I have
> thousands without the problem you are seeing. Yes, I do get an
> occasional status error under heavy loads, but they've always been
> recoverable and the systems continue to chug along.

Cables are between 40-45 cm / 15.5-17 in.


PGP public key: https://tnonline.net/secure/pgp_key.txt


RE: Error messages.

2003-03-06 Thread berthiaume_wayne
Anders, here is what I have and it works on thousands of duplicate
servers:

Tyan S2420 with 1.0GHz PIII
512MB RAM
Promise PDC20269 in PCI1
Intel Dual 10/100 NIC in PCI2
Four Maxtor 250GB IDE drives off of the Promise controller
lk 2.4.19 on RH7.3

hdparm -a64 -K1 -W1 -u1 -m16 -c1 -d1 /dev/hd
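(For reference, the flags as documented by hdparm: -a64 sets the
filesystem read-ahead to 64 sectors, -K1 keeps the settings over a drive
reset, -W1 enables the drive's write cache, -u1 unmasks other interrupts
while servicing a disk interrupt, -m16 enables multiple-sector I/O at 16
sectors per interrupt, -c1 enables 32-bit I/O support, and -d1 enables
DMA.)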

Regards,
Wayne.

-Original Message-
From: Anders Widman [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 06, 2003 3:46 AM
To: [EMAIL PROTECTED]
Subject: Re: Error messages.


> On Wed, 2003-03-05 at 21:51, Anders Widman wrote:
>> > On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:
>> New Promise controllers
>> PDC20268 (Ultra 100Tx2)

> does that mean you only tested on these pdc's ?

I changed from three Ultra100 boards to Ultra100Tx2 boards. Now I only
use two boards in this particular system.

> If so then drop this damn PDC controller and get one that is
> supported under linux (e.g. hpt370 based controllers).

> I had the very same problems with these PDC20268 controllers. When I
> switched to anything above MDMA0 (note not even UDMA) the system was
> freezing from time to time.

This happens here too...

> On the internal controller your drives should work all fine (via/intel
> chipsets work nicely), also on hpt based chipsets and also cmd is
> supporting linux... but forget about promise. This company just does not
> support linux.

> I was using kernels 2.4.19/20/21pre1/21pre4/21pre4-ac5 and all had the
> very same problem. When I heard from others that they had problems with
> promise I switched... and I am now enjoying a rock stable system.

It might just have to come to this, but I do not want to buy new
hardware :)

> Soeren.






PGP public key: https://tnonline.net/secure/pgp_key.txt


RE: Error messages.

2003-03-06 Thread berthiaume_wayne
That's rather puzzling... I did not have the same problems with the
mii driver; however, I was unable to run the full extent of the 250GB drive
or the UDMA level 6 with mii under 2.4.13, so I was using a special patched
driver from Promise to support both the pdc20269 and 48LBA. In 2.4.19 the
48 LBA support was added, so I was able to get the full address range on
the 250GB drives without patches from Promise; however, I was still unable
to run UDMA level 6 on the onboard Intel chip.
I still use the Promise pdc20269 and run UDMA level 6 on thousands
of deployed servers at this time. What is the cable length from drives to
controller? Even though you have several configured servers, I have
thousands without the problem you are seeing. Yes, I do get an occasional
status error under heavy loads, but they've always been recoverable and the
systems continue to chug along.

-Original Message-
From: Anders Widman [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 06, 2003 3:44 AM
To: [EMAIL PROTECTED]
Subject: Re: Error messages.


> Hello!

> On Thu, Mar 06, 2003 at 09:32:38AM +0100, Anders Widman wrote:

>> > And for this case I am sure this was a scratchy CD-ROM disk in my
>> > CD-ROM drive.
>> Well, I have no CD-ROM. :)

> /dev/hdg is one of my CD-ROMs ;)

>> > Probably the same stuff can be seen when the drive is busy remapping
>> > bad sectors?
>> > Use smartctl to find out how these messages correlate with the
>> > remapped bad sector counts?
>> Very strange. That would mean all of my hard drives are broken, or on
>> their way to getting broken. I do not believe that. Most of the
>> hardware, including the cabling, has been replaced and changed.

> Well, seems as Wayne have noticed, you have one common part:
> Promise controllers. How about using different kind of controller
> on one of the boxes and see if it helps?

Perhaps, but the same happens on the internal controller. In fact,
the internal controller (either the VIA or the Intel) causes the system
to freeze when it happens too many times.

> Bye,
> Oleg






PGP public key: https://tnonline.net/secure/pgp_key.txt


RE: reiserfsprogs 3.6.5-pre2 release.

2003-02-25 Thread berthiaume_wayne
Vitaly, how does "check-followed-by-fixable" in 3.6.3 compare to
"-a" at boot in 3.6.5? 
Regards,
Wayne.

-Original Message-
From: Vitaly Fertman [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 25, 2003 8:50 AM
To: [EMAIL PROTECTED]
Subject: reiserfsprogs 3.6.5-pre2 release.



Hi all!

this new pre-release includes:

- a critical bug on pass0 of rebuild-tree - an overflow while checking
unformatted item offsets - was fixed.
- a bug in relocation of shared object ids - the entry key sometimes did
not get updated correctly with the new key - was fixed.
- a bug in bitops operations for big-endian machines was fixed.
- a bug with the superblock being overwritten during journal replay was
fixed.

- while opening the journal, check that the journal parameters in the
superblock and in the journal header match; advise running rebuild-sb if
not. While rebuilding the superblock, do the same check and ask the user
whether to rebuild the journal header, continue without the journal, or
change the start of the partition before using reiserfsck.
- check that all invalid bits of the bitmap are set to 1; set them
correctly if not.

- fix-fixable does not relocate shared object ids anymore, as that is too
complex for fix-fixable; only rebuild-tree does it.

- reiserfsck -a (started at boot) replays the journal, checks the error
flags in the superblock, the bitmaps, the fs size, and 2 levels of the
internal tree, and switches to fixable mode if any problem is detected.
For the root fs, fixable cannot be performed (as the fs is mounted) and
just --check will be done.

- journal replay was improved: a) blocks are checked for whether they
could be journal blocks before replaying; b) only transactions whose
trans_id == the last replayed transaction's trans_id + 1 are replayed
(see the sketch after this list).

- warning messages were improved.
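For concreteness, a minimal C sketch of the sequential-replay rule in
item (b); the names and structure are hypothetical, not the actual
reiserfsprogs code:

/* Hypothetical sketch: a transaction is replayed only when its trans_id
 * immediately follows the id of the last transaction already replayed;
 * stale or out-of-order transactions are skipped. */
struct transaction {
    unsigned long trans_id;
    /* ... descriptor block, list of logged blocks, etc. ... */
};

static unsigned long replay_transactions(struct transaction *trans,
                                         int count,
                                         unsigned long last_replayed_id)
{
    int i;

    for (i = 0; i < count; i++) {
        if (trans[i].trans_id != last_replayed_id + 1)
            continue;   /* not the direct successor: do not replay */
        /* ... copy the transaction's blocks to their real locations ... */
        last_replayed_id = trans[i].trans_id;
    }
    return last_replayed_id;   /* id of the newest transaction replayed */
}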

-- 

Thanks,
Vitaly Fertman


[reiserfs-list] Data Shredding on a Journal Filesystem

2002-09-24 Thread berthiaume_wayne

Hello fellow ReiserFS fans. I'm in search of a data shredder for use
on reiserfs and am wondering if anyone knows of one. It would need to
effectively remove any trace of all data pertaining to a file, both in the
journal and on the disk itself. I'm not sure, but I thought at one time
Hans was talking about something like this himself. I have looked at
shred(1), but it does not work with journalling filesystems.
Most appreciatively,
Wayne





RE: [reiserfs-list] RE: lk 2.4.19 ReiserFS Build

2002-08-14 Thread berthiaume_wayne

Oleg, does this include the "speedup" series both you and Chris have
been working on?
Regards,
Wayne.

-Original Message-
From: Oleg Drokin [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 13, 2002 1:34 AM
To: Manuel Krause
Cc: reiserfs-list
Subject: Re: [reiserfs-list] RE: lk 2.4.19 ReiserFS Build


Hello!

On Tue, Aug 13, 2002 at 09:17:06AM +0400, Oleg Drokin wrote:
> > On 08/12/2002 08:21 PM, [EMAIL PROTECTED] wrote:
> > >   Thanks Chris. Should I pull the 2.4.20-pre series from Namesys?
> > ^^^ 
> > 
> >  >  this special Namesys series
> bk://thebsh.namesys.com/bk/reiser3-linux-2.4
> But it won't last there for long since it was not accepted by Marcelo,
> it seems.

Oh, wait, it was indeed accepted by Marcelo, as I see now.
So you can pull it from Marcelo's tree as well.

I am still going to port it to 2.5, though.

Bye,
Oleg



RE: [reiserfs-list] RE: lk 2.4.19 ReiserFS Build

2002-08-14 Thread berthiaume_wayne

Hi Manuel. That was one of the problems I was hoping to find an
answer for, but it appears, as Oleg has mentioned, the patches are being
tested in 2.5 first before Marcelo will accept them in 2.4. 
Regards,
Wayne.

-Original Message-
From: Manuel Krause [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 13, 2002 12:06 AM
To: [EMAIL PROTECTED]
Cc: reiserfs-list
Subject: Re: [reiserfs-list] RE: lk 2.4.19 ReiserFS Build


Hi Wayne!

Please let me know immediately, when you've received this mail, where
you've found:

On 08/12/2002 08:21 PM, [EMAIL PROTECTED] wrote:
>   Thanks Chris. Should I pull the 2.4.20-pre series from Namesys?
 ^^^ 

  >  this special Namesys series


> These were to be the speedup series, if memory serves me correctly.
> I'd like to get the best performance possible for our next production
> run.
> Wayne.
> 

[snip]


Thank you in advance, ;-)


Sorry, Wayne, I know what you mean. Your previously reviewed patchset
from one of my latest mails has worked fine for me here until now
(but including the not previously mentioned rml-preempt-patch for the
latest -rc-, and for non-server usage),

best regards,

Manuel



RE: [reiserfs-list] fsync() Performance Issue

2002-05-06 Thread berthiaume_wayne

I'll add the write caching into the test just for info. Until there
is a way to guarantee the data is safe, I'll have to go with no write
caching, though. I should have all this testing done by the end of the week.

-Original Message-
From: Chris Mason [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 03, 2002 6:00 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: [reiserfs-list] fsync() Performance Issue


On Fri, 2002-05-03 at 16:35, [EMAIL PROTECTED] wrote:
>   Chris, I have some quick preliminary results for you. I have
> additional testing to perform and haven't run debugreiserfs() yet. If you
> have a preference for which tests to run debugreiserfs() let me know.
>   Base testing was done against 2.4.13 built on RH 7.1 using the
> test_writes.c code I forwarded to you. The system is a Tyan with single
> PIII, IDE Promise 20269, Maxtor 160GB drive - write cache disabled. All
> numbers are with fsync() and 1KB files. As I said, more testing, i.e.
> filesizes, need to be performed.

> 2.4.19-pre7 speedup, data logging, write barrier / no options
>   => 47.1ms/file

Hi Wayne, thanks for sending these along.

I expected a slight improvement over the 2.4.13 code even with the data
logging turned off.  I'm curious to see how it does with the IDE cache
turned on.  With scsi, I see 10-15% better without any options than an
unpatched kernel.

> 2.4.19-pre7 speedup, data logging, write barrier / data=journal
>   => 25.2ms/file
> 2.4.19-pre7 speedup, data logging, write barrier /
> data=journal,barrier=none
>   => 27.8ms/file

The barrier option doesn't make much difference because the write cache
is off.  With write cache on, the barrier code should allow you to be
faster than with the caching off, but without risking the data (Jens and
I are working on final fsync safety issues though).

Hans, data=journal turns on the data journaling.  The data journaling
patches also include optimizations to write metadata back to disk in
bigger chunks for tiny transactions (the current method is to write one
transaction's worth back, when a transaction has 3 blocks, this is
pretty slow).

I've put these patches up on:

ftp.suse.com/pub/people/mason/patches/data-logging

>   One question is will these patches be going into the 2.4 tree and
> when?

The data logging patches are a huge change, but the good news is they
are based on the nesting patches that have been stable for a long time
in the quota code.  I'll probably want a month or more of heavy testing
before I think about submitting them.

-chris




RE: [reiserfs-list] fsync() Performance Issue

2002-04-30 Thread berthiaume_wayne

Thanks. I'll start putting this one into test.
Wayne.

-Original Message-
From: Chris Mason [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 30, 2002 10:28 AM
To: Oleg Drokin
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [reiserfs-list] fsync() Performance Issue


On Tue, 2002-04-30 at 10:20, Oleg Drokin wrote:

> Attached is a speedup patch for 2.4.19-pre7 that should help your fsync
> operations a little. (From Chris Mason.)
> The filesystem cannot do very much at this point, unfortunately; it
> ends up waiting for the disk to finish write operations.
> 
> Also we are working on other speedup patches that would cover
> different areas of write performance itself.

A newer one (against 2.4.19-pre7) is below.  It has not been through as
much testing on the namesys side, which is why Oleg sent the older one.

Wayne and I have been talking in private mail; he's getting a bunch of
beta patches later today (this speedup, data logging, updated barrier
code), along with instructions for testing.

-chris

# Veritas (Hugh Dickins supplied the patch) sent the bits in
# fs/super.c that allow the FS to leave super->s_dirt set after a
# write_super call.
#
diff -urN --exclude *.orig parent/fs/buffer.c comp/fs/buffer.c
--- parent/fs/buffer.c  Mon Apr 29 10:20:24 2002
+++ comp/fs/buffer.c    Mon Apr 29 10:20:22 2002
@@ -325,6 +325,8 @@
lock_super(sb);
if (sb->s_dirt && sb->s_op && sb->s_op->write_super)
sb->s_op->write_super(sb);
+   if (sb->s_op && sb->s_op->commit_super)
+   sb->s_op->commit_super(sb);
unlock_super(sb);
unlock_kernel();
 
@@ -344,7 +346,7 @@
lock_kernel();
sync_inodes(dev);
DQUOT_SYNC(dev);
-   sync_supers(dev);
+   commit_supers(dev);
unlock_kernel();
 
return sync_buffers(dev, 1);
diff -urN --exclude *.orig parent/fs/reiserfs/bitmap.c comp/fs/reiserfs/bitmap.c
--- parent/fs/reiserfs/bitmap.c Mon Apr 29 10:20:24 2002
+++ comp/fs/reiserfs/bitmap.c   Mon Apr 29 10:20:19 2002
@@ -122,7 +122,6 @@
   set_sb_free_blocks( rs, sb_free_blocks(rs) + 1 );
 
   journal_mark_dirty (th, s, sbh);
-  s->s_dirt = 1;
 }
 
 void reiserfs_free_block (struct reiserfs_transaction_handle *th, 
@@ -433,7 +432,6 @@
   /* update free block count in super block */
   PUT_SB_FREE_BLOCKS( s, SB_FREE_BLOCKS(s) - init_amount_needed );
   journal_mark_dirty (th, s, SB_BUFFER_WITH_SB (s));
-  s->s_dirt = 1;
 
   return CARRY_ON;
 }
diff -urN --exclude *.orig parent/fs/reiserfs/ibalance.c comp/fs/reiserfs/ibalance.c
--- parent/fs/reiserfs/ibalance.c   Mon Apr 29 10:20:24 2002
+++ comp/fs/reiserfs/ibalance.c Mon Apr 29 10:20:19 2002
@@ -632,7 +632,6 @@
/* use check_internal if new root is an internal node */
check_internal (new_root);
/*&&*/
-   tb->tb_sb->s_dirt = 1;
 
/* do what is needed for buffer thrown from tree */
reiserfs_invalidate_buffer(tb, tbSh);
@@ -950,7 +949,6 @@
 PUT_SB_ROOT_BLOCK( tb->tb_sb, tbSh->b_blocknr );
 PUT_SB_TREE_HEIGHT( tb->tb_sb, SB_TREE_HEIGHT(tb->tb_sb) + 1 );
do_balance_mark_sb_dirty (tb, tb->tb_sb->u.reiserfs_sb.s_sbh, 1);
-   tb->tb_sb->s_dirt = 1;
 }

 if ( tb->blknum[h] == 2 ) {
diff -urN --exclude *.orig parent/fs/reiserfs/journal.c comp/fs/reiserfs/journal.c
--- parent/fs/reiserfs/journal.c    Mon Apr 29 10:20:24 2002
+++ comp/fs/reiserfs/journal.c  Mon Apr 29 10:20:21 2002
@@ -64,12 +64,15 @@
 */
 static int reiserfs_mounted_fs_count = 0 ;
 
+static struct list_head kreiserfsd_supers = LIST_HEAD_INIT(kreiserfsd_supers);
+
 /* wake this up when you add something to the commit thread task queue */
 DECLARE_WAIT_QUEUE_HEAD(reiserfs_commit_thread_wait) ;
 
/* wait on this if you need to be sure your task queue entries have been run */
 static DECLARE_WAIT_QUEUE_HEAD(reiserfs_commit_thread_done) ;
 DECLARE_TASK_QUEUE(reiserfs_commit_thread_tq) ;
+DECLARE_MUTEX(kreiserfsd_sem) ;
 
#define JOURNAL_TRANS_HALF 1018   /* must be correct to keep the desc and commit structs at 4k */
@@ -576,17 +579,12 @@
 /* lock the current transaction */
 inline static void lock_journal(struct super_block *p_s_sb) {
   PROC_INFO_INC( p_s_sb, journal.lock_journal );
-  while(atomic_read(&(SB_JOURNAL(p_s_sb)->j_wlock)) > 0) {
-PROC_INFO_INC( p_s_sb, journal.lock_journal_wait );
-sleep_on(&(SB_JOURNAL(p_s_sb)->j_wait)) ;
-  }
-  atomic_set(&(SB_JOURNAL(p_s_sb)->j_wlock), 1) ;
+  down(&SB_JOURNAL(p_s_sb)->j_lock);
 }
 
 /* unlock the current transaction */
 inline static void unlock_journal(struct super_block *p_s_sb) {
-  atomic_dec(&(SB_JOURNAL(p_s_sb)->j_wlock)) ;
-  wake_up(&(SB_JOURNAL(p_s_sb)->j_wait)) ;
+  up(&SB_JOURNAL(p_s_sb)->j_lock);
 }
 
 /*
@@ -756,7 +754,6 @@
   atomic_set(&(jl->j_commit_flushing), 0) ;
   wake_up(&(jl->j_commit_wait)) ;
 
-  s->s_dirt = 1 ;
   return 0 ;
 }
 
@@ -1220,7 +12

RE: [reiserfs-list] fsync() Performance Issue

2002-04-29 Thread berthiaume_wayne

Agreed, it would be better to sync to disk after multiple files
rather than serially; however, in the interest of not being concerned
about a power outage during the process (one of the reasons the disk
cache is disabled), the choice was to fsync() each write.
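A minimal sketch of that per-file pattern, with hypothetical names, for
concreteness:

#include <fcntl.h>
#include <unistd.h>

/* Write one buffer to a file and force it to disk before returning.
 * This is the per-file durability pattern discussed above. */
static int write_file_durably(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    /* fsync() does not return until the file's data and metadata have
     * been flushed to the device; with the drive write cache disabled,
     * the data is on the platters when this returns. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}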

-Original Message-
From: Chris Mason [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 29, 2002 12:46 PM
To: [EMAIL PROTECTED]
Cc: Russell Coker; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [reiserfs-list] fsync() Performance Issue


On Mon, 2002-04-29 at 12:32, Toby Dickenson wrote:

> >One thing that has occurred to me (which has not been previously
> >discussed as far as I recall) is the possibility of using sync()
> >instead of fsync() if you can accumulate a number of files (and
> >therefore replace many fsync()'s with one sync()).
> 
> I can see
> 
> write to file A
> write to file B
> write to file C
> sync
> 
> might be faster than
> 
> write to file A
> fsync A
> write to file B
> fsync B
> write to file C
> fsync C

Correct.

> 
> but is it possible for it to be faster than
> 
> write to file A
> write to file B
> write to file C
> fsync A
> fsync B
> fsync C

It depends on the rest of the system.  sync() goes through the big lru
list for the whole box, and fsync() goes through the private list for
just that inode.  If you've got other devices or files with dirty data,
case C that you presented will always be the fastest.  For general use,
I like this one the best, it is what the journal code is optimized for.

If files A, B, and C are the only dirty things on the whole box, a
single sync() will be slightly better, mostly due to reduced cpu time.

-chris
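For concreteness, a minimal sketch of the batched ordering described
above (write A, B, C, then fsync A, B, C); the helper and its names are
hypothetical:

#include <unistd.h>

/* Write to every file first, then fsync each one. Per the discussion
 * above, deferring the fsyncs lets the journal commit the dirty data
 * for all the files in fewer, larger transactions than the
 * write/fsync/write/fsync sequence would. */
static int write_then_fsync_batch(int fds[], const char *bufs[],
                                  const size_t lens[], int n)
{
    int i;

    for (i = 0; i < n; i++)          /* write to file A, B, C... */
        if (write(fds[i], bufs[i], lens[i]) != (ssize_t)lens[i])
            return -1;

    for (i = 0; i < n; i++)          /* ...then fsync A, B, C */
        if (fsync(fds[i]) != 0)
            return -1;

    return 0;
}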




[reiserfs-list] fsync() Performance Issue

2002-04-29 Thread berthiaume_wayne

I'm wondering if anyone out there may have some suggestions on how
to improve the performance of a system employing fsync(). I have to be able
to guarantee that every write to my fileserver is on disk when the client
has passed it to the server. Therefore, I have disabled the write cache on
the disk and issue an fsync() per file. I'm running 2.4.19-pre7, reiserfs
3.6.25, without additional patches. I have seen some discussions out here
about various other "speed-up" patches and am wondering if I need to add
these to 2.4.19-pre7? What are they, and where can I obtain said patches?
Also, I'm wondering if there is another solution to syncing the data that
is faster than fsync(). Testing, thus far, has shown a large disparity
between running with and without sync. Another idea is to explore another
filesystem, but I'm not exactly excited by the other journaling filesystems
out there at this time. All ideas will be greatly appreciated.

Wayne
EMC Corp
Centera Engineering
4400 Computer Drive
M/S F213
Westboro, MA 01580

email:   [EMAIL PROTECTED]
voice:   (508) 898-6564
pager: (888) 769-4578  (numeric)
[EMAIL PROTECTED]  (alpha)
fax:  (508) 898-6388

"One man can make a difference, and every man should try."  - JFK
