Re: [zones-discuss] networking

2010-02-17 Thread Ellard Roush

Hi,

The clprivnet0 interface is something provided by Solaris Cluster.
The clprivnet software is a kind of highly available trunking driver
that communicates across all of the private networks that connect
the machines of the cluster. This set of networks is often
called the private interconnect.
When there are Zone Clusters, the software automatically sets
a subnet and IP addresses for each Zone Cluster.
I believe that the cluster software does so as well for the
Global Cluster.
The administrator specifies a set of subnets and IP addresses
for this purpose when configuring the cluster.
If you have further questions related to clprivnet,
I would suggest sending those questions to sunclus...@sun.com

Regards,
Ellard

On 02/16/10 14:19, Enda O'Connor wrote:

Hi
Are you sure cluster is disabled? What does /usr/cluster/bin/status show?

Enda

On 16/02/2010 21:59, Dombrowski, Neil wrote:

-----Original Message-----
From: sowmini.varad...@sun.com [mailto:sowmini.varad...@sun.com]
Sent: Tuesday, February 16, 2010 1:16 PM
To: Dombrowski, Neil
Cc: zones-discuss@opensolaris.org
Subject: Re: [zones-discuss] networking

On (02/16/10 19:03), Dombrowski, Neil wrote:


I'm new to zones, and this appears to be a conundrum for me: I have a
global zone that shows multiple default routes (on different
interfaces). It also shows a third, separate interface (clprivnet0) with
an IP that's not in anyone's documentation (actually, there are two
physical servers set up the same way). My guess is that these two
servers were to be clustered at one point, but this was aborted before
I came onboard. Regardless, the global zone's routing table looks busy.
Is it because it's showing the routes for the zones? If so, is it
possible to have the global zone route differently than the local
zones?


Hard to answer without more data on what the subnets for the various
zones are, and what the desired routing is. The global zone's netstat
may show routes that are only accessible from a non-global zone, so the
fact that the routing table is busy does not say anything without
more information about the subnet configuration.

--Sowmini

For example, let's say zone1 has a default route using gateway
172.16.1.1 and zone2 has a default route using gateway 192.168.0.1.
If I am logged into the global zone, and it needs to send a packet to
10.10.10.10, will it use one of the non-global zones' default routes?
Looking at /etc/defaultrouter for the global zone, it shows the
gateway IPs for the two non-global zones, and also 10.10.10.1. When
I try to traceroute to 10.10.10.10, it never shows a single hop (as if
it's not going to any gateway).


So, why am I not getting to 10.10.10.10? And if I remove the other
default routes in the global zone, will I be damaging the routing for
the local zones? If I add a static route in the global zone, will that
be propagated to the non-global zones (I wouldn't want that)? If there's
a good doc out there that explains this, I'd appreciate a pointer to
it, or whatever advice you have for me.


Thanks,
  Neil

___
zones-discuss mailing list
zones-discuss@opensolaris.org



[zones-discuss] [Fwd: RAC on Zone Clusters BluePrint]

2009-05-13 Thread Ellard Roush

From: Ellard
Subject: RAC on Zone Clusters BluePrint

Recently, there were a number of queries for a current document
on how to run RAC in non-global zones. We have just published
this new document.

Gia-Khanh and I published a small book in the BluePrint series
that explains the following:

  1) A very brief overview of Zone Clusters
  2) An overview of how Sun Cluster supports RAC
  3) A detailed example of how we configured one system
     to support RAC running in Zone Clusters
  4) For RAC 9i/10g/11g on supported storage topologies
     we provide an outline of the steps
     needed to deploy RAC and show the application management
     configuration (RGM Resource Group / Resource / Dependencies / Affinities).

The goal is to explain the use of Zone Clusters to support RAC.

 Deploying Oracle Real Application Clusters (RAC) on Solaris Zone Clusters

The Wiki page url is:

http://wikis.sun.com/display/BluePrints/Deploying+Oracle+Real+Application+Clusters+(RAC)+on+Solaris+Zone+Clusters

The sun.com mirror page url is:

http://www.sun.com/offers/details/820-7661.xml

---
This document is intended for public use.

Please contact me if you have any questions.


Re: [zones-discuss] S10 brand spec.

2009-05-13 Thread Ellard Roush

Hi Jerry,

This document provides a lot of useful information.

In the section "solaris10 Brand: What's Not Emulated"
you repeat some old information that is no longer correct.

 One point to note is that TX will continue to
  be incompatible with branded zones.

That statement probably dates to the time when the
lx brand was the only branded zone other than native.
Solaris Trusted Extensions (TX) does not support lx
and so it was correct at that time to state that TX does
not support branded zones.

The BrandZ framework now supports multiple kinds of zones,
including the native brand zone.
The BrandZ framework provides a powerful mechanism for
tailoring the behavior of a zone.

The Sun Cluster organization has taken the native brand
zone and used the BrandZ framework to add callbacks for
notifying Sun Cluster software about various zone changes.
This cluster brand zone is a branded zone, but is
really a native zone with cluster hooks. Our goal
is to make this cluster brand zone behave as much as
possible just like the native brand zone.

We have recently been successful in getting
Sun Cluster to work with TX using the cluster
brand zone. The Zone Cluster provides a cluster-wide
security container.   :)
So please revise your statement
about TX not being able to work with zones other than
the native zone.

Please note that this is a very
recent development. The Sun Cluster organization has
not announced support for this as a product offering.
Here is the usual disclaimer that an engineering
milestone is not the same as a product announcement.

In the past, there have been bugs where code was written
that assumed that only a native zone supported various
options. Please assume that both native and cluster
brand zones need the same support.

The Sun Cluster organization is considering supporting
Zone Clusters composed of non-global zones based upon
the solaris10 container for Solaris.Next.
This will probably require a cluster+solaris10 composite
brand zone. We recognize that the Sun Cluster organization
would have to create this composite zone. However,
we do request that the design for the solaris10 zone
make it possible for the Sun Cluster organization to
create such a composite zone by reusing all or almost all solaris10
brand features. We would also be interested in reusing the p2v and v2v
features.

TX on Solaris.Next will use a branded zone (according to
the latest information that I have heard). The Sun Cluster organization
will be interested in supporting a composite cluster+TX brand zone.

The net result is that we recommend that the Solaris Zones team
consider how zone features would be reused by the previously
mentioned composite branded zones.

Regards,
Ellard

On 05/12/09 04:28, Jerry Jelinek wrote:

Enclosed is a first draft of a spec. for the S10
brand which we plan to submit for a PSARC
inception review.  Please send us any comments
or questions.

Thanks,
Jerry

---

   S10C: A Solaris 10 Branded Zone for Solaris.Next

 Gerald Jelinek, Jordan Vaughan
   Solaris Virtualization Technologies


[A note on terminology: This document uses the terms Solaris 10 and
 Solaris.Next very frequently.  As such, the abbreviations S10 and
  S.next respectively are used interchangeably with the longer forms.
  The term virtualization is abbreviated as V12N.]


Part 1: Introduction


Each new minor release of Solaris brings with it the well known problems
of slow user adoption, slow ISV support and concerns about compatibility.
The compatibility concerns will be more pronounced with the release of
S.next since it's anticipated that there will be greater than normal
user-visible changes (e.g. the packaging system, etc.).

Fortunately, since the last minor release of Solaris (Solaris 10), V12N
techniques have become widespread and V12N can be used as a solution to
ease the transition to the new version of Solaris.  Zones[1] combined
with a brand[2] are particularly well suited for this task since the host
system is actually running S.next, whereas this is not necessarily the
case with other V12N solutions.  In addition, zones are usable on any
system which runs S.next, which is also not the case with other V12N
alternatives.

We already have a proven track record delivering this sort of
zones/brand based solution to enable running earlier versions of Solaris
on S10 [3, 4], so in one sense this case breaks little new ground.
However, the earlier 'solaris8' and 'solaris9' brands were used to host
releases that are very static as compared to hosting a zone running S10.
In addition, S.next can be expected to continue to change rapidly for
the foreseeable future.  Given this, a 'solaris10' brand for S.next poses
additional challenges for projects on both the S10 and S.next sides of
the system.  Many of these challenges are outside of the scope of an
architectural review and include developer education, testing and

[zones-discuss] [Fwd: df -h in zone cluster]

2009-04-30 Thread Ellard Roush

Hi,

The question raised by Sunil seems to be a zones question.
Does anyone have an explanation, or is this a bug?

Regards,
Ellard

-------- Original Message --------
Subject: df -h in zone cluster
Date: Thu, 30 Apr 2009 10:28:38 -0400
From: Sunil Sohani <sunil.soh...@sun.com>
To: sunclus...@sun.com

Hi,

IHAC running zone clusters. They are running some monitoring software
within the zone cluster which does df -h to monitor disk space.


Here is sample output:

# df -h
Filesystem             size   used  avail capacity  Mounted on
/                        0K   8.1G    13G    39%    /
/dev                    21G   8.1G    13G    39%    /dev
/lib                    31G   9.6G    21G    32%    /lib
/platform               31G   9.6G    21G    32%    /platform
/sbin                   31G   9.6G    21G    32%    /sbin
/usr                    31G   9.6G    21G    32%    /usr
/usr/local              31G   9.6G    21G    32%    /usr/local
proc                     0K     0K     0K     0%    /proc
ctfs                     0K     0K     0K     0%    /system/contract
mnttab                   0K     0K     0K     0%    /etc/mnttab
objfs                    0K     0K     0K     0%    /system/object
swap                    17G   376K    17G     1%    /etc/svc/volatile
fd                       0K     0K     0K     0%    /dev/fd
swap                    17G    64K    17G     1%    /tmp
swap                    17G    56K    17G     1%    /var/run
/sbin                   31G   9.6G    21G    32%    /var/cluster/sbin.org
/usr/cluster/lib/sc/ifconfig_client_proxy
                        31G   9.6G    21G    32%    /sbin/ifconfig
zdtgdbq01/odb            0K    29K    86G     1%    /odb
zdtgdbq01/odb/dtg02/flashdata01
                         0K   5.1G   4.9G    52%    /odb/dtg02/flashdata01
zdtgdbq01/odb/dtg02/oraarch
                         0K   525M   9.5G     6%    /odb/dtg02/oraarch
zdtgdbq01/odb/dtg02/orabackup
                         0K    14M  10.0G     1%    /odb/dtg02/orabackup
zdtgdbq01/odb/dtg02/orabin
                         0K   2.2G   7.8G    22%    /odb/dtg02/orabin
zdtgdbq01/odb/dtg02/oradata01
                         0K   1.2G   8.8G    12%    /odb/dtg02/oradata01
zdtgdbq01/odb/dtg02/oradata02
                         0K   1.3G   8.7G    13%    /odb/dtg02/oradata02
zdtgdbq01/odb/oem01/orabin
                         0K   1.9G   8.1G    19%    /odb/oem01/orabin

zdtgdbq01 is a ZFS pool used for an Oracle database that has been added to this
zone cluster. The monitoring software looks at the size column, sees 0K, and
starts sending alerts.


Is there a solution for this? Or is that how it is going to be and they need to 
change the way they monitor it?
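
If the monitoring has to change, one possible approach is to stop trusting the size column and derive the total from used + avail whenever size is reported as 0K. The following is only a sketch under that assumption; the helper names (to_bytes, parse_df, percent_used) are hypothetical, it does not explain why the size column shows 0K, and used + avail may not match the real size where quotas or reservations are in play.

```python
import re

# Multipliers for the unit suffixes that df -h prints.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def to_bytes(text):
    """Convert a df -h value such as '9.6G' or '0K' to a byte count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT])", text)
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    return int(float(match.group(1)) * UNITS[match.group(2)])


def parse_df(output):
    """Yield (filesystem, size, used, avail, mountpoint) for each entry.

    Long filesystem names are printed on their own line, with the numbers
    on the following line, so a single-field line is held until the next one.
    """
    pending = None
    for line in output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "Filesystem":
            continue
        if len(fields) == 1:
            pending = fields[0]
            continue
        if pending is not None:
            fields = [pending] + fields
            pending = None
        if len(fields) != 6:
            continue  # not a data row (for example the command line itself)
        fs, size, used, avail, _capacity, mount = fields
        yield fs, to_bytes(size), to_bytes(used), to_bytes(avail), mount


def percent_used(size, used, avail):
    """Percentage used, falling back to used + avail when size is 0."""
    total = size if size > 0 else used + avail
    return 0.0 if total == 0 else 100.0 * used / total


SAMPLE = """Filesystem size used avail capacity Mounted on
/usr 31G 9.6G 21G 32% /usr
zdtgdbq01/odb/dtg02/oradata01
 0K 1.2G 8.8G 12% /odb/dtg02/oradata01
proc 0K 0K 0K 0% /proc
"""

for fs, size, used, avail, mount in parse_df(SAMPLE):
    print(mount, round(percent_used(size, used, avail)))
```

Pseudo file systems such as proc report 0K everywhere, so a monitor would alert only when the computed percentage crosses a threshold, not on a zero size.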


Customer thinks they haven't configured the resource group properly.

Sunil



Re: [zones-discuss] Patching clustered zones

2009-01-22 Thread Ellard Roush
Hi Geoff,

As an introduction, I work on the Sun Cluster development team.

Questions about how to support the existing Sun Cluster product
can be sent to

sunclus...@sun.com

There are people that answer questions about the existing product.

Sun Cluster today supports non-global zones in two ways:

1) HA-Containers - this approach makes it look like the
non-global zone can fail over between machines. However,
this approach is NOT based upon detaching a zone from one machine
and then attaching the zone to another machine.

2) Another approach treats a non-global zone as a place where
applications can be started/halted under Sun Cluster control.
However, this approach does not provide isolation.

Sun Cluster will very, very, very soon ship SC3.2 update 2,
which introduces a new zone feature:

Zone Cluster - this is a virtual cluster where each virtual node
is a non-global zone. The application inside the Zone Cluster sees
the Zone Cluster as a dedicated private cluster. This feature
provides application fault isolation, security isolation,
resource management, and license fee cost containment. We provide
many ease-of-use features. I would be happy to provide details on
this new feature.



Sun Cluster supports the use of ZFS as the root file system with SC3.2u2.


Sun Cluster supports several approaches for changing software.

1. Halt all nodes. Install new software on each node. Boot all nodes.

2. Rolling Upgrade - halt one node at a time.
While in non-cluster mode, install new software on that
node. Reboot that node into cluster mode. Repeat process for each node
until done. A portion of the cluster remains up at all times.

3. Quantum Leap - Halt half the cluster and install new software
in non-cluster mode. The Quantum Leap then does a very quick handoff
of services from the partition with old software to the partition with
new software. Next upgrade this second partition.
Reboot the second partition in cluster mode, and the full cluster reforms.
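
To make the difference between approaches 2 and 3 concrete, here is a small model of which software versions are present in the running cluster after each step. It is purely illustrative: the function names and the "old"/"new" labels are hypothetical, and nothing here corresponds to an actual Sun Cluster command.

```python
def rolling_upgrade(nodes):
    """Model a Rolling Upgrade: one node at a time leaves, upgrades, rejoins.

    Returns a list of (node_just_upgraded, versions_in_cluster) pairs,
    where versions_in_cluster is the sorted list of versions present
    once that node has rebooted into cluster mode.
    """
    version = {node: "old" for node in nodes}
    history = []
    for node in nodes:
        # The node is halted, upgraded in non-cluster mode, then rebooted
        # into cluster mode while the other nodes keep running.
        version[node] = "new"
        history.append((node, sorted(set(version.values()))))
    return history


def quantum_leap(nodes):
    """Model a Quantum Leap: upgrade half, hand off, upgrade the other half.

    Returns a list of (phase, nodes_providing_service, versions_in_service).
    """
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to split the cluster")
    half = len(nodes) // 2
    first, second = nodes[:half], nodes[half:]
    return [
        ("upgrade first partition in non-cluster mode", second, ["old"]),
        ("hand off services to first partition", first, ["new"]),
        ("upgrade second partition and reform cluster", first + second, ["new"]),
    ]


for step in rolling_upgrade(["node1", "node2", "node3"]):
    print(step)
for phase in quantum_leap(["node1", "node2", "node3", "node4"]):
    print(phase)
```

In this model the rolling upgrade passes through steps where both versions are in the cluster at once, while the partition providing service during a Quantum Leap always runs a single version.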

We also support Live Upgrade.

So it is possible to change software with minimal down time.
All of these approaches can be used with patches, as well as update releases.

Approaches 1 and 3 always run a cluster with all nodes at the same release level.
Rolling Upgrade supports the situation where the different nodes
are at different OS and SC patch levels
(but does nothing for application patches).

Quantum Leap can be used to upgrade:
   OS,
   Sun Cluster,
   3rd party File System or Volume Manager,
   application software,
and any other software that you can put on a cluster.

At this point Sun Cluster does not have any need for a patch on attach function.
If you still believe that there is an important upgrade scenario
that Sun Cluster does not support, please let me know.

I am the Technical Lead for the infrastructure area, which includes
both Zone Clusters and upgrade technology.

Regards,
Ellard Roush

On 01/22/09 11:40, Geoff Lane wrote:
 We are in the process of setting up a service consisting of SAN based global
 storage which will host a number of ZFS based zones, each running an
 application that must be made highly available.  The zones are made highly
 available using Solaris Cluster and failover.  This is all rather standard
 and is described in the Sun zone/cluster docs. For technical reasons all the
 zones will be full root zones.
 
 However, we are having trouble finding a safe patching procedure that
 minimises application downtime.  The zone docs tell us that a zone should be
 maintained at the same patch level as the global zone, but in a cluster this
 seems impossible without a total break in service.  At some point the app
 zones will be running on one of the global zones forming the cluster but
 with mismatched patches.
 
 The patch on attach facility will be nice, but is it going to work in a
 cluster?  Will the cluster notice that the zone is very slow booting up and
 treat it as a failure? Even if cluster doesn't care, the apps are still down
 while the zone is being patched.
 
 Is there a solution that I've missed?
 
 Thanks,


Re: [zones-discuss] documentation for zones

2008-11-24 Thread roush
Hi,

Please see comments inline.

Ellard

Edward Pilatowicz wrote:
 On Fri, Nov 21, 2008 at 01:02:14PM +0100, Maciej Browarski wrote:

 Jerry Jelinek wrote:
 Maciej Browarski wrote:
 Hello,
 Is there any consolidated documentation about how to build the config.xml
 and platform.xml files?
 Information about their content is spread across many documents, but I can't
 find exactly which options are correct and possible in these two files.
 There are no docs because those are project private
 interfaces.  I assume you are trying to create
 your own brand?  Perhaps you can tell us more about
 what you are trying to do.

 Thanks,
 Jerry
 Yes, I am trying to understand how zones work and how we can configure them. :)
 
 well, you shouldn't be modifying any of the parameters in platform.xml,
 config.xml, or any of the zone xml files.
 
 once again, what are you trying to do?
 
 So I have the questions below:
 - What is the difference between the default, prohibited, and
 required privilege sets in config.xml?
 
 well, the default privs are the privs that all zones get.
 the prohibited privs are ones that can't be added to zones by zonecfg.
 the required privs are ones that can't be removed from zones by zonecfg.
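
The three sets described above can be read as simple set rules. The sketch below is a hypothetical model of that reading, not the zonecfg implementation: the function name is invented, the sample set contents are made up for illustration (they are not the actual contents of any config.xml), and the "default" and "!name" tokens are an assumption modeled on the zonecfg limitpriv syntax.

```python
def resolve_limitpriv(spec, default, prohibited, required):
    """Expand a comma-separated privilege spec and check it against the sets.

    'default' expands to the default set, '!name' removes a privilege,
    and any other token adds that privilege.
    """
    privs = set()
    for token in spec.split(","):
        token = token.strip()
        if token == "default":
            privs |= default
        elif token.startswith("!"):
            privs.discard(token[1:])
        elif token:
            privs.add(token)

    forbidden = privs & prohibited
    if forbidden:
        raise ValueError("prohibited privileges requested: %s" % sorted(forbidden))
    missing = required - privs
    if missing:
        raise ValueError("required privileges removed: %s" % sorted(missing))
    return privs


# Illustrative sets only; a real brand defines its own.
DEFAULT = {"file_chown", "proc_fork"}
PROHIBITED = {"sys_config"}
REQUIRED = {"proc_fork"}

print(sorted(resolve_limitpriv("default,sys_time", DEFAULT, PROHIBITED, REQUIRED)))
```

With these sample sets, a spec of "default,sys_config" is rejected because it asks for a prohibited privilege, and "default,!proc_fork" is rejected because it drops a required one.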
 
  - Are these privileges only information telling zoneadm how to configure
 zones, or do they have any impact on creating and running zones? (In other
 words, is this list of privileges also hard coded in the kernel, with
 config.xml only informing about the privileges?)
 
 zone privs are not hardcoded into the kernel.
 
 - If I change only the brand name in config.xml, I see this name later in
 zoneadm list -iv. Does this affect only zoneadm list, or also kernel
 performance? (To be more clear: is there any native brand hard coded in the
 kernel, such that a native zone has more privileges and is faster than
 other names and brands? What exactly is carried in struct
 brand p_brand and p_brand_data in the proc_t structure?)
 
 you will break things if you randomly change zone brand names.
 
 there is special handling for the native brand in the kernel.
 if a zone is of type native, the kernel doesn't invoke any of
 the optional brandz interposition callbacks.  that said, i don't
 think you'd be able to see any observable performance differences.
 
Please note that there is a cluster Brand zone.
From the perspective of packaging/patching/updating,
the cluster Brand zone is identical to the native brand.
If the native Brand zone ever gets any other kind of special
treatment, the cluster Brand zone will need the same treatment.
The cluster Brand zone is really a native Brand zone with
cluster hooks.

 the p_brand and p_brand_data structures are used to keep track
 of process brand specific data.
 
 - Which options determine that packages are also installed/updated
 from the global zone? (I would like to keep old packages, not updated, in
 zones, but without the -G option.) I am aware that I can break dependencies
 between packages.
 
 the packaging tools ignore all non-native (ie, branded) zones.
 there is no brand flag that tells the packaging system to keep
 a branded zone in sync with the global zone.
 
No. The cluster brand zone is treated just like a native brand zone
by the packaging tools. We have a PSARC contract on this point.

The Solaris software should NEVER assume that a brand zone is always different
from the native brand zone type. The cluster brand zone needs the
same support as the native brand zone.

 - if I clear attach and detach options packages, will be not checked in
 zoneadm attach and attached will be successful ?

 
 i don't really understand this question.
 
 ed


Re: [zones-discuss] Questions regarding Solaris containers

2008-10-27 Thread Ellard Roush
Hi,


The Sun Cluster Express release has already shipped a new feature
called Zone Clusters. This is a Virtual Cluster where the virtual
node is a zone. A major reason for developing this feature was to
provide the ability to run Oracle RAC in zones. Oracle RAC
requires a cluster environment, in other words Oracle RAC requires
multiple machines. The Zone Cluster satisfies the needs of Oracle RAC.
We have successfully run RAC 9i, 10g, and 11g on the same hardware
at the same time in different zone clusters. The Sun Cluster Marketing
organization always announces new features. As an engineer I cannot
formally announce new support. However, I would be happy to provide
more information for you, and can even demonstrate this feature
in actual operation for interested people.
The next product release of Sun Cluster will be SC3.2 update 2
early in 2009. I am most optimistic about supporting RAC with
Zone Clusters soon.

If anybody wants more detailed information, please contact me.

Regards,
Ellard

...
 8.  What databases are supported today for Solaris containers? As 
 per the bigadmin document “db_in_containers”, only non-RAC Oracle is 
 supported by containers. Is this still valid today or is there support 
 provided for Oracle RAC?
 
 Oracle is supported. I understand that RAC support may be coming.
 
 Is DB2 supported inside containers?
 
 I don't know.
 
 Steffen
 
 Thanks in advance.
  
 Regards,
 -Narsimha
  


Re: [zones-discuss] Running Oracle Database inside Solaris 8/9 Container Using Sun Cluster

2008-10-20 Thread roush
Hi Dr. Hung-Sheng Tsao,

Sun Cluster today supports a single-machine Oracle database running
inside a Solaris Container (also called a non-global zone),
where the zone is a native brand zone.

In the latest Sun Cluster Express release, Sun Cluster supports
a new feature called a Zone Cluster, which is a Virtual Cluster
where the virtual nodes are all non-global zones. We have run
Oracle RAC 9i, 10g, and 11g concurrently inside different
Zone Clusters on the same set of hardware. Our
Marketing team formally announces new features in the commercial
product. If you are interested in running Oracle RAC in
zones, please contact me and I will provide details on how
we will soon be supporting the ability to run Oracle RAC in zones.
Please note that I will be at a conference the rest of the week.
So I will respond next week upon my return.

If anyone else is interested in running RAC in zones,
please contact me.

Regards,
Ellard

Dr. Hung-Sheng Tsao (LaoTsao) wrote:
 
 
 Eric Li wrote:
 Dear All,

 Our customers like to run existing Oracle database inside Solaris 8/9 
 container using Sun Cluster. Please kindly advise if
 - Is this configuration certified by Oracle?
   
 not
 - Will it be supported by Oracle?
   
 not
 - Will Sun Cluster support this? (Sun Cluster 3.2 02/08?)
   
 ha-zone
 - Any references?

 Thank you in advance for your help.

 Best regards,
 Eric


Re: [zones-discuss] Zone management

2008-07-24 Thread Ellard Roush
Hi Nathan,

The Sun Cluster organization is introducing a new cluster Brand zone,
that is based upon the native Brand zone plus hooks for Sun Cluster.
It would be nice if the design were flexible enough so that we
could leverage your proposed tools for a Brand zone other
than native, or at least for a Brand that is almost native.

Regards,
Ellard Roush

Nathan Dietsch wrote:
 Hello All,
 
 I am looking to the OpenSolaris community for input on your respective  
 experiences with management tools for zones.
 
 I am currently looking at the options for managing zones and am  
 looking for a tool that would let me;
 
 * Install new zones according to a template
 * Clone existing zones
 * Detach and attach zones on different systems to facilitate migration
 * Flag an update on attach operation for a zone (I realise that this  
 has not yet been implemented in Solaris 10)
 * Patch a zone
 * Shutdown/restart a zone
 * Handle the management of SMF services within a zone
 
 Nice to have, but not overly necessary
 
 * Integration with Solaris Resource Manager
 * Deploy a server personality to a zone (Packages, Conf Files, Mounts,  
 SMF services etc)
 
 I know that xVM Ops Centre can handle most of these tasks, but I am  
 not sure that it handles the zone migration or update on attach  
 components and I have briefly read about Container Manager which is  
 part of SMC.  I know that both Sun Cluster and VCS can handle the zone  
 migration components, but they do not have the scope to handle the  
 other tasks. Are there any other tools out there that handle these  
 sort of tasks? How do you manage zones in your respective environments?
 
 Any and all input is much appreciated.
 
 Kind Regards,
 
 Nathan Dietsch 


Re: [zones-discuss] Patches vs Updates - Zone Features

2008-04-21 Thread roush
Hi Enda,

Enda O'Connor ( Sun Micro Systems Ireland) wrote:
 Ellard Roush wrote:
 Hi,

 Solaris 10 update 4 introduced the BrandZ feature set.
 Solaris 10 update 5 will introduce more zone features.
 Today Sun Cluster requires that the Solaris 10 release be at
 least up to the Solaris 10 update 3 level.

 We are proposing to ship a new feature in Sun Cluster
 that will use the BrandZ feature set and support
 the new zone features in Solaris 10 update 5.
 Naturally, this new feature will only be operational
 when the customer installs Solaris 10 update 5.

 There are at least 2 ways to load new software.

 1) Install the Solaris 10 update 5 release

 In this case we know that everything works fine.

 2) Install patches for Solaris 10 update 5.
 This approach loads all of the bug fixes,
 and does not load the new packages.

 If a customer installs patches,
 will BrandZ and all the new zone features
 of Solaris 10 update 5 work ?
 Or will the customer just get the bug fixes ?
 they'll get everything in this case.
 127127-11/127128-11 is the u5 kernel patch that will deliver all this.
 What features are you interested in?
 
We are using the BrandZ framework to support a cluster Brand zone that
is the same as the native Brand zone with hooks added
for our software. For
example, we use the callbacks to learn when zones change state up vs
down, while we still execute the original native brand functionality
in these cases.

We are going to support a Zone Cluster, which is a virtual cluster,
where each virtual node is a cluster brand zone. This will enable
us to support cluster applications inside a zone environment.

This means that we need S10u4 in order to get the BrandZ feature set.
We also would like to support the new zone features of S10u5,
which will probably include hard caps on CPUs.

 As for installing patches on zones systems, there are a few things they
 need to be aware of:
 1. Install the latest patch utils first (119254/119255).
 2. Always run patchadd -a patch-id first before installing the patch.
 
 The -a option does a dry run and, especially in the case of zones, will catch
 issues like zone dependency issues, unbootable zones, etc. No files get
 modified, so it allows you to identify certain types of issues (not all
 issues, mind you).
 The patchadd -a output is pretty hard to parse, but make sure to examine 
 closely for any issues relating to zones etc.
 
 Enda
 


Thanks for the information.
Ellard


 ---
 Our people in the field say that customers are
 much more willing to install a patch
 as opposed to installing an update.

 Some aspects of patching are murky.
 So your help is appreciated.

 Regards,
 Ellard


Re: [zones-discuss] The quick & dirty guide to zones on iSCSI LUNs

2008-04-03 Thread Ellard Roush
Hi James,

James Carlson wrote:
 Ellard Roush writes:
 James Carlson wrote:
 That point in time is as soon as your application can start.  It need
 not have any dependencies at all.

 Here is the other point that needs to be clarified.
 This is not an application.
 Applications do not start until much later.
 We have to get the cluster formed and cluster services established
 before applications run.
 
 We probably have different definitions of that term.  For networking,
 an application is something that uses the services provided by a
 transport or (for raw sockets) network layer protocol.
 
 I'm not talking about user applications; just things that use
 networking services in some way.
 
OK. Now I understand what you mean.

 Your program (whatever it is) should not need dependencies on
 networking in order to be successful.  As I suggested before, it's
 sometimes helpful to listen to routing sockets (you can get hints
 there about when it might be a good time to shorten a retry timer, and
 thus make your program respond more quickly), but it's not really a
 dependency issue.
 
 The internal interfaces that we had to use are not well documented.
 Your explanation helps understand what is probably going on.
 
 It's hinted at in the documentation, but not as well-documented as it
 should be.  man -s 3socket connect says:
 
  underlying transport provider. Generally, stream sockets can
  successfully connect() only once. Datagram sockets  can  use
 [..]
  ECONNREFUSED The  attempt  to  connect  was   forcefully
   rejected.   The   calling   program  should
   close(2) the socket descriptor,  and  issue
   another  socket(3SOCKET)  call  to obtain a
   new descriptor  before  attempting  another
   connect() call.
 
 That generally is also true for most unsuccessful connect() calls
 and the advice under ECONNREFUSED is actually true for pretty much all
 failures.  The exceptions are the non-failure failures -- EALREADY,
 EINPROGRESS, and EWOULDBLOCK.  I think that issue is what the text is
 trying to dance around.
 
 You're partly connected (at least bound) after the real failures, and
 getting back to a clean state is easiest just by close() and trying
 again.
 
 The usual references (Stevens and others) have more detailed
 discussions.  The underlying problem is that for much of the BSD
 world, the code *is* the documentation, so whatever sockets did, well,
 that's what they do.
 
 (For what it's worth, this isn't even one of the darker corners.  Raw
 socket behavior, for example, varies in mysterious ways across OS
 platforms and even across releases of a given OS.)
 
Thanks for the explanation. Our Quorum Server uses the
approach that you suggested. We discovered it the hard way.
We are now attempting to use iSCSI devices as quorum devices.
I will share your insight with the iSCSI people.

Regards,
Ellard


Re: [zones-discuss] The quick & dirty guide to zones on iSCSI LUNs

2008-04-02 Thread Ellard Roush
Hi James,

It is already well known that routes come and go.
It is already well known that the way to determine
whether a destination is reachable is to attempt to
contact that destination.
That is NOT the issue that I am raising.

We have seen the following PROBLEM.
Our code has a dependency upon Solaris network routing.
After SMF reports that Solaris network routing initialization
has begun, we attempt to contact the quorum device.
That attempt fails. We wait and retry.
ALL SUBSEQUENT retries fail !!!
If we make the code sleep long enough for Solaris routing to
complete initialization, then retries after a failed attempt
to connect work whenever the route becomes
available. The problem is that Solaris routing goes into
an error state when we attempt to connect before it is ready.
SMF starts services as soon as the service dependencies are satisfied.
So we can and do attempt our first connection before
Solaris routing is really ready !

We are not asking for indication as to when a route is present.
We want to know when we can attempt to establish a connection
without Solaris routing going into an error state that
causes all subsequent attempts to connect to fail.

We have found another recovery method for this problem.
We do not just retry the connection.
We destroy all network data structures (the socket) before retrying.
This clears the bad state. Retries then eventually succeed.
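
The recovery described above can be sketched roughly as follows. This is a minimal illustration in Python, not the cluster code itself: the function name, the deadline, and the pause between attempts are all assumptions. The only point it makes is that each attempt gets a brand-new socket and a failed socket is closed rather than reused.

```python
import socket
import time


def connect_with_fresh_sockets(host, port, deadline_s=2.0, pause_s=0.1):
    """Try to connect until deadline_s elapses, one new socket per attempt.

    Returns a connected socket, or None if the deadline passes first.
    All names and timing values here are illustrative assumptions.
    """
    give_up_at = time.monotonic() + deadline_s
    while True:
        remaining = give_up_at - time.monotonic()
        if remaining <= 0:
            return None
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(remaining)
        try:
            sock.connect((host, port))
            sock.settimeout(None)  # back to blocking mode for the caller
            return sock
        except OSError:
            # Never retry on the failed socket: close it and build a new one.
            sock.close()
            time.sleep(min(pause_s, max(0.0, give_up_at - time.monotonic())))
```

A caller would treat a None result as "not reachable within the deadline" and handle it as a failure, in line with the timing constraint discussed in this thread.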

Regards,
Ellard


James Carlson wrote:
 Ellard Roush writes:
 Thanks for explaining about how the routing situation changes dynamically.
 However, we have been aware of that for a long time.

 Sun Cluster (SC) is a High Availability product.
 We have customers that want recovery to occur in less than 2 seconds.
 While we have not achieved that goal, we are working in that direction.
 This means that some operations MUST complete very quickly.
 A late completion of an operation is a failure.
 
 Understood.
 
 As a general principle, though, you cannot demand that other systems
 do anything you want at any other time.  When networking is involved,
 other independent systems are involved.
 
 In other words, I think the focus is on the wrong level here.  The
 whole deployment -- the routers, bridges, and other infrastructure
 included -- must be designed to meet your goal, not _just_ this one
 bit of Solaris software.  (And once that's done, the state of routing
 in Solaris may or may not be at issue.)
 
 More specifically, when a quorum device is unreachable for substantial
 periods of time, the unreachable quorum device is in a failed state
 as far as we are concerned. This is true even when the device
 might be reachable 60 seconds from now. The administrator
 must configure a quorum device that can be reached reliably
 in a short time period.
 
 The solution is easy at this level: send a packet.  If you get a
 sensible response, then that system is in fact reachable.  If you
 don't get a sensible response within the time constraint that you've
 set for yourself, then it's not.
 
 That's really the only information available.
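
As a rough sketch of the "send a packet and wait for a sensible response" test, assuming a hypothetical UDP service on the far end: the function name, the payload, and the is_sensible callback are illustrative and not part of any Solaris or Sun Cluster interface.

```python
import socket


def responds_in_time(host, port, request, is_sensible, timeout_s=2.0):
    """Send one datagram and report whether a sensible reply arrives in time.

    is_sensible is a caller-supplied check on the reply bytes. Any socket
    error, including a timeout, is treated as "not reachable right now".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.settimeout(timeout_s)
        sock.connect((host, port))   # fixes the peer; sends nothing yet
        sock.send(request)
        reply = sock.recv(2048)      # raises a timeout error if nothing comes
        return is_sensible(reply)
    except OSError:
        return False
    finally:
        sock.close()
```

The result only says what happened during this one probe. It makes no prediction about the next attempt, which is the limitation described above.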
 
 The current SMF information does not even tell us when the Solaris
 routing software can even accept attempts to communicate.
 
 That's correct.  As I've already outlined *it doesn't know* and (more
 importantly) *it cannot in principle know*.
 
 Or, if you prefer: it always accepts attempts to communicate.  It just
 won't always be successful in those attempts.
 
 We already
 know that the attempts can fail. Before the routing software in
 Solaris is ready, all attempts to communicate will fail.
 We just want to know when it is safe to try.
 We are not asking for a dependency upon when a specific route is present.
 We know that is not possible.
 We have encountered problems when an attempt is made before
 the routing software is ready.
 We want to access the quorum device as soon as we can for
 quicker recovery, but no sooner than can be achieved reliably.
 
 There's just no general solution to the problem.
 
 If the only thing you care about is whether routing has established a
 route to somewhere, then (as I mentioned before) you can listen to a
 routing socket to observe the resulting RTM_ADD.  I don't think
 that'll actually help you in your quest, but it's certainly doable and
 answers the immediate (and I think improperly formed) question of when
 routing software in Solaris is 'ready'.  For some value of ready,
 at least.
 
 There is simply *NO WAY* that the system can tell you a priori whether
 an attempt to transmit a packet will actually result in that packet
 being sent from the system (ARP can still fail and Spanning Tree can
 disable ports silently) or whether delivery is possible.
 
 Only sending data can do that, and only then in retrospect.  If you
 get an answer, then it must have worked.
 
 I strongly disagree that we should be offering any sort of "routing is
 ready" checkpoint or SMF dependency.  It'd be misleading at best, and
 would result in a new class of unsolvable failure modes

Re: [zones-discuss] The quick & dirty guide to zones on iSCSI LUNs

2008-04-02 Thread Ellard Roush
Hi James,

James Carlson wrote:
 Ellard Roush writes:
 If we make the code sleep long enough for Solaris routing to
 complete initialization, then after a failed attempt
 to connect, then retries work whenever the route becomes
 available. The problem is that Solaris routing goes into
 an error state when we attempt to connect before it is ready.
 
 OK, it sounds like we're talking at cross-purposes here.
 
Yes. But we finally seem to be reaching an understanding.
That is progress.

 I haven't seen such a problem myself (it sounds like an application
 bug to me -- at a wild guess, possibly not handling dynamic interfaces
 correctly; see below).  File a bug on solaris/kernel/tcp-ip.
 
 The TCP/IP stack itself is responsible for taking user data and
 matching it against kernel routes (actually, they're forwarding
 entries).  The user space routing daemons (the things controlled by
 SMF) neither know nor _care_ what the kernel is doing with user data
 packets, so dependencies on them won't help anything.
 
 Even if some sort of error state is possible in the kernel (again, I
 haven't seen such a thing, at least not described in those terms), I
 don't see how routing daemons are involved here or how anything iSCSI
 can do would affect them.
 
 We are not asking for indication as to when a route is present.
 We want to know when we can attempt to establish a connection
 without Solaris routing going into an error state that
 causes all subsequent attempts to connect to fail.
 
 That point in time is as soon as your application can start.  It need
 not have any dependencies at all.
 
Here is the other point that needs to be clarified.
This is not an application.
Applications do not start until much later.
We have to get the cluster formed and cluster services established
before applications run.

 If you prefer, you may depend on this service so that at least lo0 is
 plumbed up when you start:
 
svc:/network/loopback:default
 
 Most networking applications don't even need that, though.
 
 We have found another recovery method for this problem.
 We do not just retry the connection.
 We destroy all network data structures (socket)
 This clears the bad state. retries then eventually succeed.
 
 It sounds to me like you're not dealing with dynamic interfaces
 correctly.
 
 If you don't explicitly bind a preferred address to use (most
 applications do not), then the kernel will choose an address for you.
 With UDP, this happens on a packet-by-packet basis.  With TCP, though,
 it happens once as the connect() request is started.
 
 When the kernel does this, it picks the best-matching kernel
 forwarding entry (at that moment in time) for the supplied destination
 IP address (UDP sendto() or TCP connect()), and then selects a source
 address based on the output interface that this entry points to.
 
 Other interfaces may come and go over time, other routes may be
 learned or forgotten, but we _never_ go back and rewire that TCP
 source address.  It perhaps doesn't sound like the best possible
 answer, but that's how BSD sockets have worked for many decades, and
 it's expected behavior.
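
The source address selection described above can be observed from user space. The sketch below is an illustration only: it uses a connected UDP socket because connect() on UDP makes the kernel perform the route lookup and pick a source address without sending any traffic. The function name and the default port are assumptions.

```python
import socket


def kernel_chosen_source(dest_ip, dest_port=9):
    """Return the source IP the kernel picks for traffic to dest_ip.

    No address is bound explicitly, so the kernel chooses one from the
    forwarding entry that best matches the destination at this moment.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((dest_ip, dest_port))  # route lookup only; no packet sent
        return sock.getsockname()[0]
    finally:
        sock.close()
```

Calling this before and after an interface or route change may return different addresses, whereas a socket that has already been connected keeps the address it was given at connect() time.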
 
 If connect() fails or if you need to give up for some reason, there's
 no way to unbind.  The proper procedure is to close the socket, and
 build a new one.
 
 I think you're barking up the wrong tree by attempting to establish
 some sort of dependency on routing.
 
The internal interfaces that we had to use are not well documented.
Your explanation helps us understand what is probably going on.

Regards,
Ellard


Re: [zones-discuss] The quick & dirty guide to zones on iSCSI LUNs

2008-03-31 Thread roush


Christine Tran wrote:
 roush wrote:
 
 Sun Cluster plans to support an iSCSI disk as a quorum device.
 Sun Cluster accesses the iSCSI disk early in the boot process.
 When the iSCSI disk is on the same subnet as the cluster machines,
 things work. When the iSCSI disk is on a different subnet
 the system cannot find the iSCSI disk (ENXIO). However,
 after Solaris is fully up we have no access problems.
 Solaris automatically boots up zones in many configurations.
 The point at which Solaris boots zones is later, so
 you may or may not hit this problem. I would be
 interested to hear whether you encounter this problem or not.

 
 Hi Ellard,
 
 No, I have not encountered this problem.  The targets mount just in time 
 for my zones.  But it sounds to me like a dependency on 
 svc:/network/routing/route:default for cluster could help this along?
 
 CT

Hi Christine,

We have dependencies upon routing.
However, this dependency only lets us know when
initialization of routing started and does not
tell us when things are ready. iSCSI hides
the fact that a network is involved, which
complicates solving this issue. But we are working on it.
Thanks for the information. This helps confirm that
we have a startup ordering problem.

Regards,
Ellard


Re: [zones-discuss] The quick & dirty guide to zones on iSCSI LUNs

2008-03-28 Thread roush
Hi Christine,

Interesting report.

We also will be supporting the use of iSCSI with Sun Cluster.
Here is one specific problem that we have encountered that may or may
not affect you.

Sun Cluster plans to support an iSCSI disk as a quorum device.
Sun Cluster accesses the iSCSI disk early in the boot process.
When the iSCSI disk is on the same subnet as the cluster machines,
things work. When the iSCSI disk is on a different subnet
the system cannot find the iSCSI disk (ENXIO). However,
after Solaris is fully up we have no access problems.
Solaris automatically boots up zones in many configurations.
The point at which Solaris boots zones is later, so
you may or may not hit this problem. I would be
interested to hear whether you encounter this problem or not.

Regards,
Ellard Roush

Christine Tran wrote:
 What is iSCSI?
 SCSI over TCP/IP.  iSCSI makes remote disks look local.  The remote host 
 with storage resource presents iscsi targets.  The client accessing the 
 storage is the initiator.  iSCSI initiator was present in S10 3/05 and 
 up.  iSCSI target went into S10 8/07.
 
 Why zones on iSCSI?
 iSCSI frees you from the limitation of putting zones on local storage. 
 The physical bits of the zoneroot can live anywhere accessible with a 
 network connection.  You can use the zone detach/attach function without 
 SAN or shared storage.  This ability circumvents a bunch of problems 
 associated with zonepath on NFS mount, for example, see  RFE 4963321: 
 hosting root filesystems for zones on NFS servers.
 
 What's the catch?
 Speed of zone installation and patching depends on how fast your network 
 is.  Currently it doesn't look like you can do a standard upgrade on a 
 box with zones on iSCSI LUNs because there are no iSCSI packages in the 
 miniroot.
 
 What works?
 Installing and booting zones on iSCSI targets, patching in single-user 
 mode, upgrading via LiveUpgrade.
 
 How to do it?
 This is a quick write-up.  I used ZFS zvol but this is not necessary. 
 ZFS makes creating iscsi targets PAINLESS and takes only one command.  I 
 placed the zonepath on a striped SVM volume because I was testing a 
 specific config, for speed, and eventually I want to use an SVM mirror 
 to provide redundancy for my zonepath.  Most outputs are omitted, what's 
 provided is for clarity.
 
 1. create the targets
 2. client discovery of target
 3. label disk, lay down SVM, filesystem
 4. configure zones
 5. apply recommended patch cluster, LU patch cluster
 6. lucreate, luupgrade, luactivate
 
 nvd is a box running snv_80 but S10 8/07 is just as good.  Client is 
 running S10 8/07.
 
 nvd# zpool create tran1 c0t18d0 c0t19d0
 nvd# zpool create tran2 c0t20d0 c0t21d0
 nvd# zfs create -V 16g tran1/xmen
 nvd# zfs create -V 16g tran2/hulk
 nvd# zfs set shareiscsi=on tran1/xmen
 nvd# zfs set shareiscsi=on tran2/hulk
 nvd# iscsitadm list target -v
 Target: tran1/xmen
 iSCSI Name: iqn.1986-03.com.sun:02:4a46145b-8b71-69ab-8cee-c8a9c4367f0a
 Target: tran2/hulk
 iSCSI Name: iqn.1986-03.com.sun:02:f57bbbf8-3504-4d9e-8c2b-ddfa45cfb641
 
 
 ~ iscsiadm add static-config iqn.1986-03.com.sun:02:4a46145b-8b71-69ab-8cee-c8a9c4367f0a,129.154.158.154
 ~ iscsiadm add static-config iqn.1986-03.com.sun:02:f57bbbf8-3504-4d9e-8c2b-ddfa45cfb641,129.154.158.154
 ~ iscsiadm modify discovery --static enable
 ~ devfsadm -i iscsi
 ~ iscsiadm list target -S
 Target: iqn.1986-03.com.sun:02:f57bbbf8-3504-4d9e-8c2b-ddfa45cfb641
 OS Device Name: /dev/rdsk/c5t0103BA681D5F2A0047E84932d0s2
 Target: iqn.1986-03.com.sun:02:4a46145b-8b71-69ab-8cee-c8a9c4367f0a
 OS Device Name: /dev/rdsk/c5t0103BA681D5F2A0047E84934d0s2
 
 ~ format
 [...]
 8. c5t0103BA681D5F2A0047E84932d0 SUN-SOLARIS-1 cyl 32766 alt 2 hd4 sec 256
    /scsi_vhci/[EMAIL PROTECTED]
 9. c5t0103BA681D5F2A0047E84934d0 SUN-SOLARIS-1 cyl 32766 alt 2 hd4 sec 256
    /scsi_vhci/[EMAIL PROTECTED]
   label
   partition
 
 [Striping, nologging and noatime for speed]
 ~ metainit d30 1 2 c5t0103BA681D5F2A0047E84932d0s0 c5t0103BA681D5F2A0047E84934d0s0 -i 32k
 ~ newfs -v /dev/md/dsk/d30
 ~ mount -F ufs -o nologging,noatime /dev/md/dsk/d30 /zones
 [You need the mount-at-boot option == yes, otherwise it would not mount 
 at boot, despite what the mount(1M) manpage says]
 ~ vi vfstab
 /dev/md/dsk/d30 /dev/md/rdsk/d30 /zones ufs 1 yes nologging,noatime
 
 ~ zonecfg -z zone1
 zonecfg:zone1> create
 zonecfg:zone1> set zonepath=/zones/zone1 [...]
 ~ zoneadm -z zone1 install
 ~ zoneadm -z zone1 boot
 
 
 {1} ok boot -s
 Entering System Maintenance Mode
 [iSCSI Initiator is present]
 ~ modinfo |grep -i iscsi
   36  13252e8  2b4a0 271   1  iscsi (Sun iSCSI Initiator v20061003-0)
 [Target LUNS are present]
 ~ iscsiadm list target
 Target: iqn.1986-03.com.sun:02:f57bbbf8-3504-4d9e-8c2b-ddfa45cfb641
 Target: iqn.1986-03.com.sun:02:4a46145b-8b71-69ab-8cee-c8a9c4367f0a
 [boot zones, apply patch cluster and LU patch cluster. sunsolve.sun.com 
 has

Re: [zones-discuss] updating a zone when attaching

2007-06-04 Thread roush

Hi Jerry,

This proposal mentions native zones.
Please ensure that the cluster brand is treated
as a native brand, as noted in PSARC 2007/304.

By the way, PSARC 2007/304 was approved last week.
The changes are now in Nevada. We have been working
with the ON & Install gate C-teams. The changes will
go into the S10u4 gates once we receive notification of what date
they want the putback to occur. After the changes
get into both S10u4 gates, I will return to discuss the
long-term solution for S10u5.

Regards,
Ellard

Jerry Jelinek wrote:

Enclosed is a draft of an ARC fast-track proposal I have been
working on recently, in-between a few other things.  I would
like to submit this for ARC review shortly but I wanted to
send this out to see if anybody had any comments before I
do that.  I have cc-ed the install-discuss alias as well, since
there is some overlap, although this is probably most interesting
to zones folks.

One additional comment.  I believe this proposal should also address
the recurring question about being able to migrate a zone from
sun4u to sun4v (and back).

Please send me any comments or questions.

Thanks,
Jerry

---

SUMMARY:

This fast-track enhances the Solaris Zones [1] subsystem to address an
existing RFE [2] requesting the ability to update a non-global zone when
migrating from one machine to another.

Currently when we migrate a zone we validate that the destination host has
the same pkg versions and patches for the zone-dependent packages as were
installed on the source host.  This is described in the zone migration ARC
case [3].  While this is safe and ensures that the new host is capable of
properly supporting the zone, it is also very restrictive.  With this
enhancement, if the new host has higher versions of the zone-dependent
pkgs, or higher versions of patches for those pkgs, then when we attach the
zone to the new host we will enable an update of the pkgs in the zone to
match the new host.

Patch binding is requested for this update on attach capability.  The
stability of these interfaces is documented in the interface table below.


DETAILS:

Update on attach is different from a traditional zone upgrade.  In the
traditional upgrade all native zones are upgraded as part of upgrading the
base system using a standard Solaris media image as the source for the pkgs
to upgrade to.  Pkg operations on pkgs with the SUNW_PKG_ALLZONES attribute
set must be run from the global zone, the operation will be performed on
all native zones, and this behavior is built-in to the pkg commands.

With update on attach we are only updating a single zone.  We cannot
depend on the basic pkg behavior which updates all zones when a pkg is
installed in the global zone.  We cannot use standard Solaris media since
the host can have a variety of patches installed which have updated the
base system pkgs beyond any specific Solaris release.

Instead what we want to do is similar to what happens when a zone is
initially installed.  The spooled pkg data and global zone files are the
source for installing the zone.  In this way the zone is installed with
the correct pkg versions along with any patches that have been applied to
those pkgs.

We can do something similar for update on attach.  The zone 'attach'
validation already generates a list of mismatched pkg versions and patches.
We can use this information to determine which dependent pkgs need to
be updated so that the zone can run properly on the new host.  We will
remove the obsolete versions of those pkgs and install the up-to-date
version from the pkg data spooled in the global zone.  This procedure will
preserve any editable or volatile files that are delivered by these pkgs.
The normal pkg install scripts and class action scripts are run as part of
this process so any updates performed by these scripts will take place.  As
described in [3] the dependent pkgs are those that have the
SUNW_PKG_ALLZONES=true pkg attribute as well as any pkgs installed in an
inherited-pkg-dir.  Only these pkgs will be updated to match the new host.

We will ensure that we will only update a zone to a host running the same
or later version of the dependent pkgs.  For example, if the new host has
a mix of higher and lower version patches as compared to the source host
then we will not allow an update during the attach.
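
The "same or later" rule above can be illustrated with a small sketch. This is not the zoneadm implementation. Versions are modeled as tuples of integers, the function and result names are hypothetical, and treating an item missing from the new host as "lower" is an assumption the proposal does not state.

```python
def attach_decision(source_versions, host_versions):
    """Classify an attach as 'match', 'update', or 'refuse'.

    Both arguments map a dependent pkg or patch name to a version,
    modeled here as a tuple of integers (an illustrative assumption).
    """
    newer = older = False
    for name, src in source_versions.items():
        dst = host_versions.get(name)
        if dst is None or dst < src:
            older = True   # assumption: a missing item counts as lower
        elif dst > src:
            newer = True
    if older:
        return "refuse"    # any lower version rules out update on attach
    return "update" if newer else "match"
```

For example, a host with one higher and one lower patch revision than the source yields "refuse", which corresponds to the mixed case described above.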

By default the zone will not be updated during attach.  Instead, the
existing output listing the pkgs that are out of sync will continue to
be printed.  We will add a new option (-u) to the 'zoneadm attach'
subcommand.  When this option is used then zoneadm will update the
necessary pkgs during the attach (assuming there are any to update).

Because the zone has previously booted and run on the source host it is

Re: [zones-discuss] Why is mount disabled for branded zones

2007-05-08 Thread Ellard Roush

Hi Enda,

This provides a good opportunity to clear up some misinformation.

The BrandZ lx zone type does not use standard patch/package
commands.

There will be BrandZ zone types that do use standard patch/package
commands. The Cluster group is now developing a cluster BrandZ
zone type that uses the BrandZ callbacks to enhance a zone.
The cluster BrandZ uses standard patch/package commands.
The Zones & BrandZ team in Solaris told us that a BrandZ approach
was the correct way to enhance a zone.

We are now in the middle of correcting these problems.

If you have information about places where this problem
appears, please let us know so that we can fix the problem.

Thanks,
Ellard


Enda O'Connor ( Sun Micro Systems Ireland) wrote:

Tirthankar wrote:

Hi,

On my machine (running s10u4_06) I have 3 local zones.

pship2 @ / $ zoneadm list -cv
  ID NAME     STATUS     PATH         BRAND     IP
   0 global   running    /            native    shared
   2 cz2      running    /zones/cz2   my_brand  shared
   5 cz4      running    /zones/cz4   native    shared
   - cz3      installed  /zones/cz3   lx        shared

pship2 @ / $

cz2 is my_brand branded zone

pship2 @ / $ zoneadm -z cz2 mount
zoneadm: zone 'cz2': mount operation is invalid for branded zones.

Why is mount command disallowed for a branded zone ?
I can boot the zone, using the normal zoneadm -z cz2 boot command

Note: The config.xml and platform.xml for my_brand is identical to 
the native brand except for the brand name.



Hi
mount is an internal state used by the patch/package commands only.
It basically does some mount magic, such that the zone's zone is mounted 
in from the global lofs, plus /dev etc. Not really applicable to a zone 
that is not native as it cannot be patched.



Enda