Re: [osol-discuss] worrying hangs

2009-07-12 Thread Matt Harrison

Che Kristo wrote:
If you are concerned about availability, maybe move it to at least a
stable OpenSolaris release; running SXCE in production seems to be
asking for trouble.


Thanks for the push. I forced myself to make time and reinstalled the
machine with 2009.06. I managed to get the pool back, set up CIFS and get
everything going again in a couple of hours.


So far I haven't managed to make the machine hang again, but I'll keep 
trying. I guess it could have been a buggy driver or something that 
didn't like my disks/controller.


Fingers crossed it's all over.

Thanks

Matt
___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org


Re: [osol-discuss] worrying hangs

2009-07-12 Thread Che Kristo
Good to hear :) Let's hope it wasn't hardware related.

On Mon, Jul 13, 2009 at 5:44 AM, Matt Harrison iwasinnamuk...@genestate.com wrote:

 Che Kristo wrote:

 If you are concerned about availability, maybe move it to at least a
 stable OpenSolaris release; running SXCE in production seems to be asking
 for trouble.


 Thanks for the push. I forced myself to make time and reinstalled the
 machine with 2009.06. I managed to get the pool back, set up CIFS and get
 everything going again in a couple of hours.

 So far I haven't managed to make the machine hang again, but I'll keep
 trying. I guess it could have been a buggy driver or something that didn't
 like my disks/controller.

 Fingers crossed it's all over.

 Thanks

 Matt


Re: [osol-discuss] worrying hangs

2009-07-11 Thread Che Kristo
If you are concerned about availability, maybe move it to at least a
stable OpenSolaris release; running SXCE in production seems to be asking
for trouble.

On Sun, Jul 12, 2009 at 12:22 AM, Matt Harrison 
iwasinnamuk...@genestate.com wrote:

 Matt Harrison wrote:

 Hi all,

 We've got a filer built on consumer hardware running SXCE snv_97, holding
 a small (1.4TB) raidz array. It's been going great for the last 6 months or
 so, but recently it has started misbehaving.

 We use in-kernel CIFS for most of our needs and it works perfectly when
 playing media or mounting backed-up CD images. The problem comes when we try
 to explicitly copy something from it.

 When you actually try a direct copy via CIFS, HTTP, SSH or FTP, there is
 about a 70% chance that the transfer will hang the machine. The larger the
 file, the more likely a hang is.

 I have no idea how to start investigating this as the network is
 inaccessible, the console is frozen and there are no hints left behind in
 the logs.

 My only way to recover the server is to shut down (with the soft-off button
 on the case) and boot up again. I can tell the machine isn't totally hung, as
 it still appears to go through a proper shutdown procedure.

 We're really out of ideas and worried that there could be a problem with
 our raidz array, even though there are no errors logged concerning it. As
 far as we can tell, there is no problem with any data (yet), just the system
 itself.

 Any ideas how to go about investigating this further?


 I hate to pester but I'm surprised no-one has any ideas on this. We are
 constantly worried about what the side-effects of the server hanging might
 be, and of course it is decreasing the availability of our data.

 Thanks

 Matt

Re: [osol-discuss] worrying hangs

2009-07-11 Thread Matt Harrison

Che Kristo wrote:
If you are concerned about availability, maybe move it to at least a
stable OpenSolaris release; running SXCE in production seems to be
asking for trouble.


Thanks for the reply. I am aware that SXCE isn't the best release. I've 
been planning a move to 2009.06, but I thought if this is a hardware 
problem, then I should fix that first.


I'll try exporting the data pool, installing 2009.06 onto the root disk 
and re-importing. I understand this should preserve the data perfectly. 
Of course, that should provide a more stable platform and I'll report 
back if I'm still having the hangs.
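
For anyone following along, the export / reinstall / import sequence
would look roughly like the sketch below. The pool name "tank" is only a
placeholder (the real pool name isn't given in this thread), and the
script just prints the commands so they can be reviewed before anything
touches the pool. The zpool subcommands themselves (status, export,
import, scrub) are the standard ZFS ones.

```shell
#!/bin/sh
# Sketch only: "tank" is a placeholder pool name, not the real one.
POOL="${POOL:-tank}"

# Print the commands for each phase instead of running them, so the
# plan can be checked by eye first.
migration_plan() {
    pool="$1"
    echo "zpool status $pool"   # before: confirm the pool reports no errors
    echo "zpool export $pool"   # before reinstall: cleanly release the pool
    echo "zpool import"         # after reinstall: list pools available for import
    echo "zpool import $pool"   # after reinstall: bring the pool back online
    echo "zpool scrub $pool"    # optional: re-verify checksums on all data
}

migration_plan "$POOL"
```

Run it as-is to see the plan for the placeholder pool, or set POOL to
the real pool name first. The scrub at the end is optional, but it gives
some reassurance about the data after the earlier hangs.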


Thanks