ZFS doing insane I/O reads

2012-02-27 Thread Ram
I just deployed ZFS on my newer Cyrus servers.
These servers get fewer than 2000 mails per hour and around 400
concurrent POP/IMAP connections.


I have seen that even when there are no incoming POP or IMAP connections,
there is still a large amount of READ activity on the ZFS partitions.
Is this normal behaviour for an IMAP server? iostat sometimes shows up to
2000 TPS.

The reads are in fact more than 10x the writes. I am afraid I will be
thrashing the hard disks.
Do I need to tune ZFS specially for Cyrus?


This is the typical zpool iostat output

zpool iostat 1
              capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
imap         145G   655G    418     58  18.0M  1.78M
imap         146G   654G    258    118  8.28M   960K
imap         145G   655G    447    146  19.4M  4.37M
imap         145G   655G    413     32  19.4M  1.46M
imap         145G   655G    339      4  14.8M  20.0K
imap         145G   655G    341     40  15.7M   755K
imap         145G   655G    305     10  15.0M  55.9K
imap         145G   655G    328     12  14.8M   136K






Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/


Re: ZFS doing insane I/O reads

2012-02-27 Thread Eric Luyten
On Mon, February 27, 2012 11:10 am, Ram wrote:
 I just deployed ZFS on my newer Cyrus servers.
 These servers get fewer than 2000 mails per hour and around 400
 concurrent POP/IMAP connections.


 I have seen that even when there are no incoming POP or IMAP connections,
 there is still a large amount of READ activity on the ZFS partitions. Is this
 normal behaviour for an IMAP server? iostat sometimes shows up to 2000 TPS.


 The reads are in fact more than 10x the writes. I am afraid I will be
 thrashing the hard disks. Do I need to tune ZFS specially for Cyrus?



 This is the typical zpool iostat output


 zpool iostat 1
               capacity     operations    bandwidth
 pool        alloc   free   read  write   read  write
 ----------  -----  -----  -----  -----  -----  -----
 imap         145G   655G    418     58  18.0M  1.78M
 imap         146G   654G    258    118  8.28M   960K
 imap         145G   655G    447    146  19.4M  4.37M
 imap         145G   655G    413     32  19.4M  1.46M
 imap         145G   655G    339      4  14.8M  20.0K
 imap         145G   655G    341     40  15.7M   755K
 imap         145G   655G    305     10  15.0M  55.9K
 imap         145G   655G    328     12  14.8M   136K


Ram,

We have a single Cyrus server, about ten times as busy as yours, with four ZFS
pools (EMC Celerra iSCSI SAN) for the message stores; all the databases, quota
and seen information are on an internal SSD-based (mirrored) pool in the server.
We also have a few GB of SSD-based ZIL (synchronous write cache) per pool.
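
A log device of that kind can be attached to an existing pool at any time; a
minimal sketch (the device name is only a placeholder for an SSD slice):

zpool add cpool1 log c4t1d0s0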


Here is our 'zpool iostat 1' output :

              capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     22     32   422K   286K
cpool2      1.18T  2.66T     29     45   578K   459K
cpool3      1.00T  2.84T     24     34   456K   314K
cpool4       993G  2.87T     25     35   455K   328K
ssd         7.49G  22.3G      4     35  17.2K   708K
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     45     16   670K   759K
cpool2      1.18T  2.66T     47     25   565K   603K
cpool3      1.00T  2.84T     33     13   410K   483K
cpool4       993G  2.87T     12      8   525K   244K
ssd         7.49G  22.3G     13    210  49.4K  10.8M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     20     22  77.9K  2.15M
cpool2      1.18T  2.66T     25      4   937K   128K
cpool3      1.00T  2.84T     20     91   324K  11.0M
cpool4       993G  2.87T     17     13   844K  83.9K
ssd         7.49G  22.3G      6    237  20.0K  20.9M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T      0      0   1023      0
cpool2      1.18T  2.66T     12     21   146K  1.26M
cpool3      1.00T  2.84T      8     26  46.5K  2.28M
cpool4       993G  2.87T     11      4   353K  24.0K
ssd         7.49G  22.3G     17    135  99.4K  8.12M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T      4      0  80.9K  4.00K
cpool2      1.18T  2.66T      7      6   133K  28.0K
cpool3      1.00T  2.84T      6      0  16.5K  4.00K
cpool4       993G  2.87T      4      4   149K  20.0K
ssd         7.49G  22.3G      9     76  51.0K  4.24M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     12      0   269K  4.00K
cpool2      1.18T  2.66T     19      0   327K  4.00K
cpool3      1.00T  2.84T      7      3  11.0K  16.0K
cpool4       993G  2.87T      5     95   167K  11.4M
ssd         7.49G  22.3G      4    226  17.5K  25.2M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     14     20   311K  1.22M
cpool2      1.18T  2.66T     19     15  85.4K  1.39M
cpool3      1.00T  2.84T      6      6  5.49K  40.0K
cpool4       993G  2.87T      4     15  17.0K  1.70M
ssd         7.49G  22.3G      6    151  21.5K  13.1M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     56     15  2.11M   559K
cpool2      1.18T  2.66T     13      7  18.5K  32.0K
cpool3      1.00T  2.84T      5      4  54.4K   392K
cpool4       993G  2.87T     17      2  66.4K   136K
ssd         7.49G  22.3G      6    109  45.9K  8.29M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     38     19   228K  1.89M
cpool2      1.18T  2.66T     29     11   160K   300K
cpool3      1.00T  2.84T      4      4  11.5K  24.0K
cpool4       993G  2.87T      9      8  31.5K  56.0K
ssd         7.49G  22.3G     12    150  46.0K  12.1M
----------  -----  -----  -----  -----  -----  -----
cpool1       901G  2.96T     32      1   106K   256K
cpool2      1.18T  2.66T     46      5   692K  95.9K
cpool3      1.00T  2.84T      7     13   189K   324K
cpool4       993G  2.87T      4      0  29.0K  4.00K
ssd         7.49G  22.3G     25     96   149K  8.08M
----------  -----  -----  -----  -----  -----  -----


Q1 : How much RAM does your server have ?
 Solaris 10 uses all remaining free RAM as ZFS read cache.
 We have 72 GB of RAM in our 
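
On Solaris 10 you can see how much of that RAM the ARC is actually holding
with kstat; a rough sketch (the statistic names assume native ZFS, not
zfs-fuse, and may vary between releases):

kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max
kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses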

Re: how to authenticate on localhost without password?

2012-02-27 Thread Dan White

On 02/26/12 12:36 -0500, Brian J. Murrell wrote:

Subject might be a bit misleading but here is the problem...

I have a cyrus imap server serving a userbase.  Of course with any mail
system comes the issue of handling spam.  My users each have two folders
in their account: Junk and Not Junk where they put their spam and
mis-identified spam.

On the imap server each user has a system (i.e. linux) account complete
with a SpamAssassin configuration including bayesian classification
database, etc. so that each user has their own database of what's spam
and what isn't.

That means that for each user to classify their spam/ham the sa-learn
process has to run as their own uid.  To achieve that goal, as well as
timely processing of the spam and ham folders, each user has a process
on the mail server running as their uid which monitors those mailboxes
and processes them (and/or each user has jobs run from their cron to
periodically do the same).

The question now is: how can I have a master process, which spawns all
of these per-user threads/processes, give them some sort of credential
that allows them to access their IMAP account, without storing a
list of accounts/passwords in a file that would need to be kept
synchronized with their system passwords (not to mention the security
nightmare of storing account passwords in plaintext)?

FWIW, this configuration is Kerberos authenticated/authorized.


Heimdal KCM should be able to handle renewing Kerberos credentials for your
users.

Another option would be to utilize SASL EXTERNAL authentication to
authenticate your users, locally, based on peercred. Cyrus IMAP does not
currently have support for external auth, but I'm attaching a Linux-specific
patch, against cyrus 2.3.12, which works for me.

I'm not sure how your spam processing fits into the picture, but your
spawned processes will need to function as IMAP clients, and will need to
be able to select the GSSAPI or EXTERNAL SASL mechanisms to use either of
the above scenarios.
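
For a quick manual check that the GSSAPI path works for a given user, the
imtest client shipped with Cyrus can force the mechanism; a sketch only (the
hostname is a placeholder, and it assumes the user already holds a ticket
from kinit):

imtest -m GSSAPI imap.example.com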


Or is there some alternative interface to the cyrus imap folder
mechanism (i.e. not through the IMAP protocol) that I am completely
missing, that would be better suited to this problem?

One possible solution I can think of that would use the IMAP protocol
for all of this is to create a single IMAP account that will be given
access (i.e. using cyrus' ACLs) to every user's Junk, Not Junk and INBOX
folders in order to read the messages, learn them and in the case of
ham, move them back to their INBOX.

But before I go down this road I just want to make sure it's really the
right road or if there is some alternative that I am just not
recognizing yet.
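
For what it is worth, the ACL route described above would look roughly like
this in cyradm; the account name "spamtrainer", the rights string and the
mailbox naming (default "." hierarchy separator) are only illustrative:

cyradm --user cyrus localhost
localhost> sam user.brian.Junk spamtrainer lrswi
localhost> sam user.brian.INBOX spamtrainer lrswi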


--
Dan White
diff -ruN cyrus-imapd-2.3.12.pristine/imap/imapd.c cyrus-imapd-2.3.12/imap/imapd.c
--- cyrus-imapd-2.3.12.pristine/imap/imapd.c	2008-04-13 10:40:29.0 -0500
+++ cyrus-imapd-2.3.12/imap/imapd.c	2008-04-22 23:14:20.0 -0500
@@ -106,6 +106,7 @@
 #include "xmalloc.h"
 #include "xstrlcat.h"
 #include "xstrlcpy.h"
+#include <pwd.h>
 
 #include "pushstats.h"		/* SNMP interface */
 
@@ -715,6 +716,8 @@
 char hbuf[NI_MAXHOST];
 int niflags;
 int imapd_haveaddr = 0;
+struct ucred pc;
+socklen_t pclen = sizeof(pc);
 
 signals_poll();
 
@@ -780,8 +783,25 @@
 	saslprops.ipremoteport = xstrdup(remoteip);
 	sasl_setprop(imapd_saslconn, SASL_IPLOCALPORT, localip);
 	saslprops.iplocalport = xstrdup(localip);
+} else {
+	if (getsockopt(0, SOL_SOCKET, SO_PEERCRED, (void *)&pc, &pclen) == 0) {
+	struct passwd *pw = getpwuid(pc.uid);
+	int result;
+	result = sasl_setprop(imapd_saslconn, SASL_AUTH_EXTERNAL, pw->pw_name);
+	if (result != SASL_OK) {
+		return -1;
+	}
+	if(saslprops.authid) {
+		free(saslprops.authid);
+		saslprops.authid = NULL;
+	}
+	if(pw->pw_name) {
+		saslprops.authid = xstrdup(pw->pw_name);
+	}
+	}
 }
 
+
 proc_register("imapd", imapd_clienthost, NULL, NULL);
 
 /* Set inactivity timer */

Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/

Re: how to authenticate on localhost without password?

2012-02-27 Thread Dan White
On 02/27/12 10:32 -0600, Dan White wrote:
Another option would be to utilize SASL EXTERNAL authentication to
authenticate your users, locally, based on peercred. Cyrus IMAP does not
currently have support for external auth, but I'm attaching a Linux
specific patch, against cyrus 2.3.12, which works for me.

I'm not sure how your spam processing fits into the picture, but your
spawned processes will need to function as IMAP clients, and will need to
be able to select the GSSAPI or EXTERNAL SASL mechanisms to use either of
the above scenarios.

I forgot to mention that to use the EXTERNAL mechanism in this way, you'll
need to spawn an imapd process on a unix socket. E.g., in /etc/cyrus.conf:

  imapunix  cmd="imapd -U 30" listen="/var/run/cyrus/socket/imap"

And your IMAP client will need the capability to speak to an IMAP server
over that unix socket, like:

socat -d READLINE /var/run/cyrus/socket/imap
(c01 AUTHENTICATE EXTERNAL)

-- 
Dan White

Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/


Re: ZFS doing insane I/O reads

2012-02-27 Thread Ram
On 02/27/2012 04:16 PM, Eric Luyten wrote:
 On Mon, February 27, 2012 11:10 am, Ram wrote:
 [...]
 Ram,

 We have a single Cyrus server, about ten times as busy as yours, with four ZFS
 pools (EMC Celerra iSCSI SAN) for the message stores; all the databases, quota
 and seen information are on an internal SSD-based (mirrored) pool in the server.
 We also have a few GB of SSD-based ZIL (synchronous write cache) per pool.


 Here is our 'zpool iostat 1' output :

 [...]


 Q1 : How much RAM does your server have ?


This is a 16 GB RAM server running Linux CentOS 5.5, 64 bit.
There seems to be something definitely wrong, because all the memory on the
machine is free.
(I don't seem to have fsstat on my server; I will have to get it compiled.)

Re: ZFS doing insane I/O reads

2012-02-27 Thread Pascal Gienger
On 28/02/2012 07:13, Ram wrote:

 This is a 16 GB RAM server running Linux CentOS 5.5, 64 bit.
 There seems to be something definitely wrong, because all the memory
 on the machine is free.
 (I don't seem to have fsstat on my server; I will have to get it
 compiled.)

ZFS as FUSE?
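
A quick way to check which one is in play (just a sketch; the exact filesystem
type string may differ):

grep zfs /proc/mounts     # zfs-fuse mounts show as type "fuse.zfs", native ZFS as "zfs"
ps -e | grep zfs-fuse     # zfs-fuse runs as a userspace daemon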

We have Solaris 10 on x86(amd64) and we noticed that ZFS needs _RAM_, 
the more, the better.

On Solaris, using mdb you can look at the memory consumption (in pages 
of physical memory):

bash-3.2# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace uppc pcplusmp 
cpu.generic zfs sockfs ip hook neti sctp arp usba fcp fctl qlc lofs sata 
fcip random crypto logindmux ptm ufs mpt mpt_sas ]
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    6052188             23641   36%
ZFS File Data             4607758             17999   27%
Anon                      2115097              8262   13%
Exec and libs                6915                27    0%
Page cache                  82665               322    0%
Free (cachelist)           433268              1692    3%
Free (freelist)           3477076             13582   21%

Total                    16774967             65527
Physical                 16327307             63778


As this is early in the morning, there are plenty of free pages in RAM
(4 million), and the memory-mapped executables of Cyrus IMAPd and shared
libraries consume only 6915 pages (27 MB).

1779 connections at this moment.

We had to go from 32 GB to 64 GB per node due to extreme lags in IMAP
spool processing. Even with 64 GB, when there is memory pressure from the
Kernel and Anon segments (mapped pages without an underlying file: classical
malloc(), or mmap on /dev/zero after copy-on-write), we still see slight
degradations in access times during high-volume hours. Another idea we had
was to use a fast SSD as a Level 2 ARC (L2ARC), called a "cache" device on
the zpool command line; based on the LRU algorithm, the blocks containing
the cyrus.* files should eventually end up there. The problem is that a pool
with a local cache device and remote SAN (Fibre Channel) storage cannot be
imported automatically on another machine without replacing the then-missing
device. And for the price of an FC-enabled SSD you can buy MUCH RAM.
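
For reference, adding such a cache device is a one-liner; a sketch only, with
a placeholder device name:

zpool add imap cache c5t0d0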

Does your CentOS system have some kind of tracing facility to find the block
numbers that are being read constantly? On Solaris I use dtrace for that, and
also for file-based I/O, to see WHICH files get read and written when there
is starvation.
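
Roughly the kind of one-liner I mean, on Solaris (a sketch; it assumes the
process name is imapd):

dtrace -n 'syscall::read:entry /execname == "imapd"/ { @[fds[arg0].fi_pathname] = count(); } tick-30s { exit(0); }'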


-- 
Pascal Gienger Jabber/XMPP/Mail: pascal.gien...@uni-konstanz.de
University of Konstanz, IT Services Department (Rechenzentrum)
Building V, Room V404, Phone +49 7531 88 5048, Fax +49 7531 88 3739
G+: https://plus.google.com/114525323843315818983/

Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/