Re: Jinxed VAIO wreckage - current state of affairs

2007-06-10 Thread Thomas Davis

Andrew Morton wrote:
> On Sat, 9 Jun 2007 22:59:49 +0200 "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
>
> > Hmm, how's 2.6.22-rc4-mm2 doing on the Vaio?
>
> People would have heard if it was busted ;)
>
> Have seen occasional hangs in e100 resume-from-RAM, and occasional
> all-black-and-dead symptoms after resume-from-RAM, but it seems to work at
> least 90% of the time.

You're doing better than I am on my S580p.

If AHCI is loaded, the damn thing will not turn off.  The goofy part is, another
Sony S80p at work does NOT have this problem - same BIOS, same drive
firmware.


Suspend to disk mostly works - sometimes when you return, the screen is 
kinda wonky.


Suspend to ram - going in appears to work, coming out it's dead.

thomas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Postgrey experiment at VGER

2006-12-13 Thread Thomas Davis

Dumitru Ciobarcianu wrote:
> On Wed, 2006-12-13 at 01:50 +0200, Matti Aarnio wrote:
> > I do already see spammers smart enough to retry addresses from
> > the zombie machine, but that share is now below 10% of all emails.
> > My prediction for next 200 days is that most spammers get the clue,
> > but it gives us perhaps 3 months of less leaked junk.
>
> IMHO this is only a step in an "arms race".
> What will you do in three months, remove this check because it will
> prove useless since the spammers will also retry? If yes, why install
> it in the first place?

Spammers are already retrying, but they give up after 10 minutes.
As the delay time increases, the chances of getting on a blacklist
increase, which makes it easier to identify a machine as a spamming bot.


I normally let my greyfilters run at 30 minutes deny, and 72 hrs of
lease time on an IP/To/From tuple.  This setting seems to be pretty
effective in dropping spam; at one point, up to 10k spam vs. a couple
hundred ham messages.
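
The deny/lease settings above can be sketched as a small in-memory
greylist. This is only an illustration of the idea, not Postgrey's actual
code: the class name, the method name and the choice not to renew a lease
on each hit are assumptions made for the example. Only the 30 minute deny
window, the 72 hour lease and the IP/To/From key come from the text above.

```python
from dataclasses import dataclass, field

DENY_SECONDS = 30 * 60          # 30 minute deny window (from the message)
LEASE_SECONDS = 72 * 60 * 60    # 72 hour lease (from the message)


@dataclass
class Greylist:
    """Toy greylist keyed on the (client IP, recipient, sender) tuple."""
    first_seen: dict = field(default_factory=dict)   # tuple -> time of first attempt
    lease_until: dict = field(default_factory=dict)  # tuple -> time the lease expires

    def check(self, ip, rcpt, sender, now):
        """Return "accept" or "defer" for a delivery attempt at time `now` (seconds)."""
        key = (ip, rcpt, sender)

        # A tuple holding an unexpired lease is let straight through.
        if self.lease_until.get(key, 0) > now:
            return "accept"
        self.lease_until.pop(key, None)  # drop an expired lease, if any

        # Otherwise remember when the tuple was first seen and keep
        # deferring until the deny window has passed.
        first = self.first_seen.setdefault(key, now)
        if now - first < DENY_SECONDS:
            return "defer"

        # The sender retried after the deny window: grant a lease.
        del self.first_seen[key]
        self.lease_until[key] = now + LEASE_SECONDS
        return "accept"
```

A sender that retries for only 10 minutes never gets past the 30 minute
window, while a real MTA that keeps retrying is accepted and then passes
without delay for the next 72 hours. A real deployment would also expire
stale first-seen entries, which this sketch leaves out.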


thomas


Re: Processes stuck on D state on Dual Opteron

2005-04-12 Thread Thomas Davis

Nick Piggin wrote:
> It is a bit subtle: get_request may only drop the lock and return NULL
> (after retaking the lock), if we fail on a memory allocation. If we
> just fail due to unavailable queue slots, then the lock is never
> dropped. And the mem allocation can't fail because it is a mempool
> alloc with GFP_NOIO.

I'm jumping in here, because we have seen this problem on an x86-64 system
with 4 GB of RAM and SLES9 (2.6.5-7.141).

You can drive the node into this state:

Mem-info:
Node 1 DMA per-cpu: empty
Node 1 Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Node 1 HighMem per-cpu: empty
Node 0 DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Node 0 Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Node 0 HighMem per-cpu: empty
Free pages:   10360kB (0kB HighMem)
Active:485853 inactive:421820 dirty:0 writeback:0 unstable:0 free:2590 
slab:10816 mapped:903444 pagetables:2097
Node 1 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
lowmem_reserve[]: 0 1664 1664
Node 1 Normal free:2464kB min:2468kB low:4936kB high:7404kB active:918440kB 
inactive:710360kB present:1703936kB
lowmem_reserve[]: 0 0 0
Node 1 HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB 
present:0kB
lowmem_reserve[]: 0 0 0
Node 0 DMA free:4928kB min:20kB low:40kB high:60kB active:0kB inactive:0kB 
present:16384kB
lowmem_reserve[]: 0 2031 2031
Node 0 Normal free:2968kB min:3016kB low:6032kB high:9048kB active:1024968kB 
inactive:976924kB present:2080764kB
lowmem_reserve[]: 0 0 0
Node 0 HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB 
present:0kB
lowmem_reserve[]: 0 0 0
Node 1 DMA: empty
Node 1 Normal: 46*4kB 19*8kB 9*16kB 4*32kB 1*64kB 0*128kB 1*256kB 1*512kB 
1*1024kB 0*2048kB 0*4096kB = 2464kB
Node 1 HighMem: empty
Node 0 DMA: 4*4kB 4*8kB 1*16kB 2*32kB 3*64kB 4*128kB 2*256kB 1*512kB 1*1024kB 
1*2048kB 0*4096kB = 4928kB
Node 0 Normal: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB 3*512kB 
1*1024kB 0*2048kB 0*4096kB = 2968kB
Node 0 HighMem: empty
Swap cache: add 1009224, delete 106245, find 179674/181478, race 0+2
Free swap:   4739812kB
950271 pages of RAM
17513 reserved pages
2788 pages shared
902980 pages swap cached
with processes doing this:
SysRq : Show State
  sibling
 task PC  pid father child younger older
init  D 0100e810 0 1  0 2   (NOTLB)
01007ff81be8 0006  
     
   010002c1d6e0
Call Trace:{try_to_free_pages+283} 
{schedule_timeout+173}
  {process_timeout+0} 
{io_schedule_timeout+82}
  {blk_congestion_wait+141} 
{autoremove_wake_function+0}
  {autoremove_wake_function+0} 
{__alloc_pages+776}
  {read_swap_cache_async+63} 
{swapin_readahead+97}
  {do_swap_page+142} 
{handle_mm_fault+337}
  {do_page_fault+411} {sys_select+1097}
  {sys_select+1311} {error_exit+0}
mg.C.2D 0100e810 0  1971   1955  1972   (NOTLB)
0100e236bc68 0006  
     
  0001 0100816ed360
Call Trace:{try_to_free_pages+283} 
{schedule_timeout+173}
  {process_timeout+0} 
{io_schedule_timeout+82}
  {blk_congestion_wait+141} 
{autoremove_wake_function+0}
  {autoremove_wake_function+0} 
{__alloc_pages+776}
  {do_wp_page+285} {handle_mm_fault+373}
  {do_page_fault+411} {error_exit+0}
mg.C.2S 01007b0a06a0 0  1972   1971  1974   (NOTLB)
0100bc1c1ca0 0006 0010 00010246
  0004c7c0 0100816ec280 00768780 010081f23390
  00018780 0100816ed360
Call Trace:{__alloc_pages+852} 
{__down_interruptible+216}
  {default_wake_function+0} 
{recalc_task_prio+940}
  {__down_failed_interruptible+53}
  {:mosal:.text.lock.mosal_sync+5}
  {:mod_vipkl:VIPKL_EQ_poll+607} 
{:mod_vipkl:VIPKL_EQ_poll_stat+529}
  {:mod_vipkl:VIPKL_ioctl+5144} 
{:mod_vipkl:vipkl_wrap_kernel_ioctl+417}
  {filp_close+126} {sys_ioctl+612}
  {system_call+124}
mg.C.2S 01007b0a18c0 0  1974   19711972 (NOTLB)
0100a3955ca0 0006 0001e7d422e8 01002c9ca550
  0005f138 0100816ec280 00768780 010081f23390
  00018780 0100816ed360
Call Trace:{__alloc_pages+852} 
{__down_interruptible+216}
  {default_wake_function+0} 
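
The free-list lines in the Mem-info dump above can be checked by hand. The
sketch below (the helper name is made up for this example) sums the
count*size terms of each zone and compares the result with the "min"
watermark printed for that zone. One plausible reading, offered as an
interpretation rather than a diagnosis, is that both Normal zones sitting
just under "min" fits with the D-state tasks being parked in
try_to_free_pages and blk_congestion_wait.

```python
import re


def buddy_free_kb(free_list):
    """Sum the 'count*sizekB' terms of one zone's free-list line."""
    terms = re.findall(r"(\d+)\*(\d+)kB", free_list)
    return sum(int(count) * int(size_kb) for count, size_kb in terms)


# Free lists and "min" watermarks copied from the dump above.
ZONES = {
    "Node 1 Normal": ("46*4kB 19*8kB 9*16kB 4*32kB 1*64kB 0*128kB 1*256kB "
                      "1*512kB 1*1024kB 0*2048kB 0*4096kB", 2468),
    "Node 0 DMA": ("4*4kB 4*8kB 1*16kB 2*32kB 3*64kB 4*128kB 2*256kB "
                   "1*512kB 1*1024kB 1*2048kB 0*4096kB", 20),
    "Node 0 Normal": ("0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB "
                      "3*512kB 1*1024kB 0*2048kB 0*4096kB", 3016),
}


def below_min(zones):
    """Return the names of zones whose free memory is under the min watermark."""
    return [name for name, (free_list, min_kb) in zones.items()
            if buddy_free_kb(free_list) < min_kb]


if __name__ == "__main__":
    for name, (free_list, min_kb) in ZONES.items():
        print(f"{name}: free {buddy_free_kb(free_list)} kB, min {min_kb} kB")
    print("below min:", below_min(ZONES))
```

The sums come out to the totals the kernel printed (2464 kB, 4928 kB and
2968 kB), so the dump is internally consistent.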

Re: Looking for ifenslave.c

2001-06-18 Thread Thomas Davis

PALFFY Daniel wrote:
> 
> On Thu, 14 Jun 2001, Thomas Davis wrote:
> 
> > Guus, there isn't a really official version of it..
> >
> > At http://pdsf.nersc.gov/linux/ifenslave.c is the last version I
> > produced, that works with bonding in v2.2 and v2.4 kernels.
> 
> > Guus Sliepen wrote:
> > >
> > > Hello,
> > >
> > > The Ethernet bonding module is useless without ifenslave.c. I'm making a Debian
> > > package for it, and I have tried to find the "official" distribution of this
> > > small program. I could not find an authoritative source, instead a lot of copies
> > > and patched versions are scattered around the Internet (I maintain a patched
> > > version myself too).
> > >
> > > I would like to combine all the useful extra features and patches into this
> > > Debian package, so if you know of a patched version or maintain one yourself,
> > > please send it to me.
> 
> The only bonding driver and ifenslave that worked for me was the patched
> version from http://sourceforge.net/projects/bonding . It runs fine over a
> quad starfire card, with vlans over it (ben's patch). You might consider
> packaging the ifenslave from that patch, and packaging the bonding driver
> as a kernel patch...
> 

Yea, and that ifenslave won't work with redhat's network setup files,
which have been in place for years.  Notice I'm not on that page?  I
consider it a forked version.  It also does things I talked to becker
about, that are not nice to the system (MII polling as often as it does is
bad.)

When I created the first 2.2 bonding patch, I didn't want to have to
rewrite redhat's already in place ifenslave support (from the 2.0.xx
kernel patch).  The ifenslave listed on that page is broken in that
regard.

The original ifenslave bonded a device to a master that was already up
and running;  the master device was also used as a xmit device  (so it
routed packets, and sometimes transmitted them).  So, if the master
device died, the slave(s) died with it.  Not good.  Redhat config files
assumed the master was up and running, and that you can add a slave to it
without any problems.  The slave device also picks up its mac address
from the master device.

In the version I created, the master device does nothing but route packets
to slaves.  This has a simple problem - no known mac hardware address
(ie, it's 0:0:0:0:0:0).  That's bad.  To set a hardware mac address,
you need to down the device, change the hw mac, and re-up the device.  But
redhat's scripts already assume the master is up and running, and
downing the master to set up the mac hw means all IP routing information
is lost.  So I added BOND_SETHWADDR, which allows ifenslave to add a
mac address to the bond master without killing any IP routing
information.  But that's not totally correct either, since adding a mac
hw address can screw up the arp tables (it appears not to, but that's
just plain luck).
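
The decision behind setting the master's address can be pictured with a
small sketch. It is not the ifenslave source and it does not issue the
BOND_SETHWADDR ioctl; the function names and the rule "copy the slave's
address when the master's is all zeros" are assumptions made for
illustration only.

```python
def parse_mac(text):
    """Parse a colon-separated MAC address into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets, got {len(parts)}: {text!r}")
    return bytes(int(part, 16) for part in parts)


def is_unset_mac(mac):
    """True when the address is all zeros, i.e. no hardware address is known."""
    return mac == bytes(6)


def choose_master_mac(master_mac, slave_mac):
    """Hypothetical rule: keep the master's MAC unless it is unset,
    in which case take over the address of the slave being added."""
    return slave_mac if is_unset_mac(master_mac) else master_mac
```

For example, `choose_master_mac(parse_mac("0:0:0:0:0:0"), parse_mac("00:50:DA:B8:33:0F"))`
returns the slave's address (a sample value), while a master that already
has an address keeps it.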

So, in summary, bonding is a hack, I strongly disagree with what is at
http://sourceforge.net/projects/bonding, but my hands are currently tied
on doing much about it (I could, but I could suffer the consequences).

-- 
+--
Thomas Davis| ASG Cluster guy
[EMAIL PROTECTED] | 
(510) 486-4524  | "80 nodes and chugging Captain!"



Re: Looking for ifenslave.c

2001-06-14 Thread Thomas Davis

Guus, there isn't a really official version of it..

At http://pdsf.nersc.gov/linux/ifenslave.c is the last version I
produced, that works with bonding in v2.2 and v2.4 kernels.

Please note: I'm currently bound up in DOE/LBNL contract issues that
prevent any work on any GPL code on DOE/LBNL time.  Folks, don't flame
us - we know it, we are working on it.  (The problem actually dates back
to the 50's, when the labs were created!)  Once this contract issue is
cleared up, I've been given the 'Ok' to work on it again.

Which means, since I don't have anything at home to work on bonding
with, I can't officially support it.

Sorry.

thomas

Guus Sliepen wrote:
> 
> Hello,
> 
> The Ethernet bonding module is useless without ifenslave.c. I'm making a Debian
> package for it, and I have tried to find the "official" distribution of this
> small program. I could not find an authoritative source, instead a lot of copies
> and patched versions are scattered around the Internet (I maintain a patched
> version myself too).
> 
> I would like to combine all the useful extra features and patches into this
> Debian package, so if you know of a patched version or maintain one yourself,
> please send it to me.
> 
> Thanks,
> 
> --
> Met vriendelijke groet / with kind regards,
>   Guus Sliepen <[EMAIL PROTECTED]>

-- 
+--
Thomas Davis| ASG Cluster guy
[EMAIL PROTECTED] | 
(510) 486-4524  | "80 nodes and chugging Captain!"



Re: Questions about Enterprise Storage with Linux

2001-03-08 Thread Thomas Davis

Tom Sightler wrote:
> 
> Hi All,
> 
> I'm seeking information in regards to a large Linux implementation we are
> planning.  We have been evaluating many storage options and I've come up
> with some questions that I have been unable to answer as far as Linux
> capabilities in regards to storage.
> 
> We are looking at storage systems that provide approximately 1TB of capacity
> for now and can scale to 10+TB in the future.  We will almost certainly use
> a storage system that provides both fiber channel connectivity as well as
> NFS connectivity.
> 
> The questions that have been asked are as follows (assume 2.4.x kernels):
> 
> 1.  What is the largest block device that linux currently supports?  i.e.
> Can I create a single 1TB volume on my storage device and expect linux to
> see it and be able to format it?
> 

Yes.

[root@pdsfdv10 data]# df .
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/rza3           1046274600 889731608 146074448  86% /export/data

> 2.  Does linux have any problems with large (500GB+) NFS exports, how about
> large files over NFS?
> 

No.

[root@pdsflx002 pdsfdv10]# df .
Filesystem           1k-blocks      Used Available Use% Mounted on
pdsfdv10.nersc.gov:/export/data
                    1046274600 889731608 146074448  86% /auto/pdsfdv10

(same filesystem, via NFS)

Files larger than 2 GB need LFS (large file support) in ia32 environments.
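
To put the df numbers above in perspective, here is a small sketch of the
arithmetic. The function name is made up, and the rounding rule (use%
rounded up, computed over used + available) is an assumption about how df
derives that column. The 2 GB figure is the largest offset a signed 32-bit
file offset can hold, which is the limit LFS lifts.

```python
import math

KIB = 1024


def df_summary(blocks_1k, used_1k, avail_1k):
    """Convert one df line (in 1k-blocks) into sizes and a use percentage."""
    size_bytes = blocks_1k * KIB
    return {
        "size_tib": size_bytes / KIB**4,    # binary terabytes
        "size_tb": size_bytes / 1000**4,    # decimal terabytes
        # Assumption: use% is rounded up and based on used + available.
        "use_pct": math.ceil(100 * used_1k / (used_1k + avail_1k)),
    }


# Largest file offset without LFS: a signed 32-bit value.
NON_LFS_MAX_OFFSET = 2**31 - 1

if __name__ == "__main__":
    # Values copied from the df output above.
    print(df_summary(1046274600, 889731608, 146074448))
    print("non-LFS limit:", NON_LFS_MAX_OFFSET, "bytes")
```

With the values from the df output this gives a volume of a little under
1 TiB (a little over 1 TB in decimal units) at 86% use.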

> 3.  What filesystem would be best for such large volumes?  We currently use
> reiserfs on our internal system, but they generally have filesystems in the
> 18-30GB ranges and we're talking about potentially 10-20x that.  Should we
> look at JFS/XFS or others?
> 

ext2 works fine, you just have to wait about 3 hrs to fsck a crashed
filesystem; ext3 also works fine.  Get a 2.2.18 kernel, apply the ext3 fs
patches, bang, you're done.

reiserfs won't work via NFS, without kernel patches.

-- 
+--
Thomas Davis| ASG Cluster guy
[EMAIL PROTECTED] | 
(510) 486-4524  | "80 nodes and chugging Captain!"



hard lockup using 2.4.1ac-1, usb, uhci

2001-02-15 Thread Thomas Davis

Hey, just found this one out.

I've got a Sony VAIO 505TX, running linux-2.4.1-ac1, and I've got all
the good stuff turned on.

With APM turned on, and using the USB uhci-alt driver (all as modules), if you
put the laptop to sleep with any (and I mean *any*) USB devices plugged
in, it will hard lock upon resume.

Only way out is to power cycle the poor thing..

I'm going to update to a newer version of the kernel, and see if the
other uhci driver suffers from this fate..

-- 
+--
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



Re: Interface statistics for Bonding bug in 2.4

2001-01-22 Thread Thomas Davis

The answer is in the source: in v2.4, stats are only collected on sent
packets; in v2.2, stats are not collected at all, they are simply summed
from the interfaces' stats.

It's not a bug; it's simply a design decision.

Chris Chabot wrote:
> 
> I recently upgraded my main server to a 2.4 kernel (2.4.1pre9). This
> machine uses 2 3Com 3C905B networkcards, bonded together (using the
> bonding module).
> 
> When doing an 'ifconfig' the bond0 device shows 0 RX packets, and a valid
> # of TX packets. However looking at eth0 / eth1 (the 2 network cards)
> they have just about the same amount of RX packets, so receiving
> does appear to be balanced over the two interfaces.
> 
> When running this machine on 2.2.16 it does show the
> interface statistics accurately. I also tested this on a clean 2.4.0
> kernel, and it had the same bug.
> 
> The ifconfig output (note the 0 packets in bond0's RX)
> 
> bond0 Link encap:Ethernet  HWaddr 00:50:DA:B8:33:0F
>   inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
> 
>   UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:3655 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
> 
> eth0  Link encap:Ethernet  HWaddr 00:50:DA:B8:33:0F
>   inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
> 
>   UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>   RX packets:1992 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:1828 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:100
>   Interrupt:11 Base address:0x9800
> 
> eth1  Link encap:Ethernet  HWaddr 00:50:DA:B8:33:0F
>   inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
> 
>   UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>   RX packets:1878 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:1827 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:100
>   Interrupt:10 Base address:0x9400
> 
> Please CC me on any replies since I'm not subscribed to the kernel list.
> 
> -- Chris

-- 
+--
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



Re: Bonding...

2000-12-01 Thread Thomas Davis

Rainer Clasen wrote:
> 
> Cisco's MAC-based distribution limits each TCP connection to 100 Mbps.
> 

What's even worse is that Cisco can also *clog* channels with traffic if
your MAC addresses aren't balanced (i.e., one line can have all the
traffic while the other is idle).

-- 
+------
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



Re: Linux 2.2.18pre21

2000-11-10 Thread Thomas Davis

Matti Aarnio wrote:
> Beowulf systems have "bonding" in use for parallel Ethernet
> links in between two machines, however THAT is not EtherChannel
> compatible thing!
> 

Maybe we should adopt Sun's naming then, and call it 'Trunking'.

This is the same driver that Beowulf uses, and it is Etherchannel
compatible.

The only parts of Etherchannel we don't support are the XOR channel
selection (yuck!) and the automatic configuration of the links (it's an
MII thing that's undocumented).

Leave it as Ethernet Bonding.

-- 
+------
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



Re: Tux2 - evil patents sighted

2000-10-03 Thread Thomas Davis

Daniel Phillips wrote:
> 
> "Jeff V. Merkey" wrote:
> > I am having Andrew McCullough review these patents to determine if there
> > are any infringement issues that may affect us.  Whoever is concerned
> > here, if it would not be too much trouble, please forward what
> > documentation and patent no.'s to [EMAIL PROTECTED] and copy me at
> > [EMAIL PROTECTED] and we will forward them to Malinkrodt &
> > Malinkrodt in Salt Lake City.  I'll pay them to do a patent infringement
> > analysis, and post their analysis to interested/affected parties.
> 
> 
> http://164.195.100.11/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1='5819292'.WKU.&OS=PN/5819292&RS=PN/5819292
> 
> 
> http://164.195.100.11/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1='5963962'.WKU.&OS=PN/5963962&RS=PN/5963962
> 
> 
> http://164.195.100.11/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/search-adv.htm&r=1&f=G&l=50&d=PALL&p=1&S1=6038570&OS=6038570&RS=6038570
> 
> I suppose you will need a formal description of my algorithm.
> 

You probably also want to add
http://www.patents.ibm.com/details?pn=US06049528__ for the bonding
driver, since it's already in the kernel and prior work can also be
demonstrated.

-- 
+--
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



Re: [PATCH-2.2] Bonding Driver Enhancements + Security fix

2000-10-03 Thread Thomas Davis

willy tarreau wrote:
> 
> > rename bond_xmit to bond_xmit_roundrobin, so
> > bond_xmit_xor can be implemented, and used if
> > desired.  bond_xmit_xor is what cisco
> > etherchannel/sun trunking really uses, not round
> > robin.
> 
> How does their xor method work? Do you know of an
> RFC describing this that I could read? I'm
> really interested in this since I must propose a
> completely redundant switch/server solution for a big
> project here. The more I know about their trunk,
> the better I may be able to do :-)

See this:

http://docs.sun.com:80/ab2/coll.539.1/UGTRUNKING/@Ab2PageView/1311?Ab2Lang=C&Ab2Enc=iso-8859-1

for information on the 4 possible trunking transmitters.
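
For a feel of what an XOR transmitter does, here is a minimal C sketch.
It is not taken from the Sun or Cisco documentation; it assumes the
simplest form of the idea, where only the last byte of the source and
destination MAC addresses is hashed, and the function name is made up
for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAC_LEN 6  /* bytes in an Ethernet MAC address */

/*
 * Pick an outgoing link by XOR-ing the last byte of the source and
 * destination MAC addresses and reducing modulo the number of links.
 * Assumption: only the last byte is hashed; real trunking
 * implementations may hash more of the address.
 */
unsigned int xor_select_link(const unsigned char src[MAC_LEN],
                             const unsigned char dst[MAC_LEN],
                             unsigned int n_links)
{
    assert(n_links > 0);
    return (unsigned int)(src[MAC_LEN - 1] ^ dst[MAC_LEN - 1]) % n_links;
}
```

Every frame between the same pair of addresses takes the same link, so
a single connection never gets more than one link's worth of bandwidth.
With two links and a source ending in 0x0F, destinations ending in 0x10
and 0x12 both hash to link 1, so link 0 stays idle unless some other
destination (say one ending in 0x11) happens to hash to it.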

-- 
----+------
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



Re: Tux2 - evil patents sighted

2000-10-03 Thread Thomas Davis

Ion Badulescu wrote:
> 
> In article <[EMAIL PROTECTED]> Daniel Phillips wrote:
> 
> > It is important that all technology used in GPL software be free of
> > patent restrictions.
> 
> Indeed.
> 
> For another fine example of GPL technology covered by a patent, check out:
> 
> http://www.patents.ibm.com/details?pn=US06049528__
> 
> This is a patent filed by Sun in June 1997 and awarded in April 2000 which
> covers very well the ethernet bonding device in Linux 2.2.x.
> 
> I wonder if the equalizer device present in Linux kernels since before
> 1996 could count as prior art. IANAL, of course.
> 

Or, even better, the fact that Ethernet bonding has been available as a
Linux patch since about 1995.

I'm sure Donald Becker could produce prior art on that one!

-- 
--------+--
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



Re: [PATCH-2.2] Bonding Driver Enhancements + Security fix

2000-10-02 Thread Thomas Davis

Willy TARREAU wrote:
> 
> Hello Thomas !
> 
> I've slightly enhanced the bonding code :
>   - MII link checking with automatic slave enabling/disabling :
> Now the bond interface monitors all its MII-compliant slaves
> and disables the ones which have a dead link, and enables those
> which have a good one. The link check time defaults to 1 second
> but I've seen no overhead even at 30 ms.
> 
>   - slave release is now possible with a running bond
>   - SMP-safe enslave/release/check/stats/xmit
>   - fix a security bug which allowed anybody to enslave any active interface,
> thus making a local denial of service.
>   - fix a potential infinite loop in bond_xmit() if no slave is useable.
> 
> It now works very well for me, and the removal of a link becomes
> completely transparent now. On monday, I'll trunk it to an alteon switch.
> 
> I've stressed the enslave/release code during "ping -f" and links up/down,
> but triggered absolutely no problem. I think it's stable enough to include it
> in 2.2.18 (Alan CC'ed for this). I'd like Constantine to test it on his servers
> because it should do exactly what he needed, and send us his feedback.
> 

OK, since work is being done on this, I have several requests.

Rename bond_xmit to bond_xmit_roundrobin, so bond_xmit_xor can be
implemented and used if desired.  bond_xmit_xor is what Cisco
Etherchannel/Sun Trunking really uses, not round robin.

Remove the variable counters from the xmit loop.  Make it more like this
(from the 2.4 bonding driver):

start_at = slave = bond

do {
        if (slave == bond)
                continue;       /* skip the list head, it is not a slave */
        if (blah..blah) {
                do xmit()
                return 0;
        }
} while ((slave = slave->next) != start_at);
kfree(skb);                     /* no usable slave: drop the packet */
return 0;

This simplifies the transmit path, and bonding is CPU-intensive already!
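
For anyone who wants to play with the loop shape outside the kernel,
here is a self-contained C model of it.  It is only a sketch under my
own assumptions: the rr_slave structure, its link_up and tx_count
fields, and the function name are hypothetical and are not the driver's
data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slave entry on a circular singly linked list. */
struct rr_slave {
    struct rr_slave *next;
    int link_up;     /* nonzero if the slave can transmit */
    int tx_count;    /* frames handed to this slave */
};

/*
 * Walk the ring once, starting after *current, and hand the frame to
 * the first usable slave.  Returns 0 on success, -1 if every slave is
 * down (the caller would then drop the frame).  The walk makes one
 * pass at most, so it cannot spin forever when no slave is usable.
 */
int xmit_roundrobin(struct rr_slave **current)
{
    struct rr_slave *start_at, *slave;

    assert(current != NULL && *current != NULL);
    start_at = slave = *current;
    do {
        slave = slave->next;
        if (slave->link_up) {
            slave->tx_count++;
            *current = slave;   /* next call resumes after this one */
            return 0;
        }
    } while (slave != start_at);
    return -1;
}
```

The only state carried between calls is the pointer to the last slave
used; there are no counters to maintain in the transmit path.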

I'd also like to get the help text included; see the attached patch for that!

-- 
----+--
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"

diff -ruN linux-2.2.17-RAID/Documentation/Configure.help linux/Documentation/Configure.help
--- linux-2.2.17-RAID/Documentation/Configure.help  Tue Sep 19 17:18:27 2000
+++ linux/Documentation/Configure.help  Mon Oct  2 13:32:41 2000
@@ -4901,6 +4901,28 @@
   time, you need to compile this driver as a module. Instead of
   'dummy', the devices will then be called 'dummy0', 'dummy1' etc.
 
+Bonding driver support
+CONFIG_BONDING
+  Say 'Y' or 'M' if you wish to be able to 'bond' multiple Ethernet
+  Channels together. This is called 'Etherchannel' by Cisco,
+  'Trunking' by Sun, and 'Bonding' in Linux.
+
+  If you have two ethernet connections to some other computer, you can
+  make them behave like one double speed connection using this driver.
+  Naturally, this has to be supported at the other end as well, either
+  with a similar Bonding Linux driver, a Cisco 5500 switch or a
+  SunTrunking SunSoft driver.
+
+  This is similar to the EQL driver, but it merges Ethernet segments
+  instead of serial lines.
+
+  For more information, please see Documentation/networking/bonding.txt.
+
+  If you want to compile this as a module ( = code which can be
+  inserted in and removed from the running kernel whenever you want),
+  say M here and read Documentation/modules.txt. The module will be
+  called bonding.o.
+
 SLIP (serial line) support
 CONFIG_SLIP
   Say Y if you intend to use SLIP or CSLIP (compressed SLIP) to



Re: Bonding Driver Questions

2000-09-25 Thread Thomas Davis

Constantine Gavrilov wrote:
> 
> 1) How can I check for the link status from the user space?
> 2) Could enslaved interface be released without bringing the master
> interface down? If yes, how? Could we have ifunslave?
> 

Link status is not used at all in v2.2 (and getting it would mean
rewriting the drivers).

Link status is used in v2.4.  Not all drivers support link status.  In
fact, I don't know of any that do - but it's possible now to do it.

Simply taking down the interface should be enough to remove it from
enslavement.

-- 
+------
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"



scsi change, two tape drives, kernel oops in st..

2000-09-20 Thread Thomas Davis

OK, I've got an ADIC SCSI tape library that is using the SCSI changer
code from http://www.strusel007.de/linux/changer.html (v0.15) with
Linux 2.2.17.

When you do:

mt -f /dev/nst0 offline
mover ex d0 s12
mt -f /dev/nst1 offline
mover ex d1 s13

you get the attached kernel oops.  I've included the ksymoops output,
along with the dmesg results (which includes the SCSI tape library
information)

From that point on, you will keep getting oopses from the second tape drive.

Any ideas on what's wrong?  I'll be looking at the st code, hoping
that it's trivial to fix.

-- 
+--
Thomas Davis| PDSF Project Leader
[EMAIL PROTECTED] | 
(510) 486-4524  | "Only a petabyte of data this year?"