Hi Bron, all.

The suggestion proposed by Vladislav for the *Backup* mechanism would simplify the operation and reduce the lock time even further, so I also like the idea. However, I see 2 possible issues there. First, each admin would have to ensure that the command he/she passes as a param has absolutely no way to block under any circumstances. Probably not a big deal under normal conditions, but corner cases should be analyzed.

Second, one may have highly granular privileges for every task, so the process that has access to the Cyrus daemon may not have filesystem modification access. IMO, there is no way (even in theory) to guarantee bug-free development with the current tools and practices, so the only feasible approach to security is compartmentalization and the defense-in-depth principle <https://en.wikipedia.org/wiki/Defense_in_depth_%28computing%29>.

One possible arrangement here would be for a backup script to request a lock from the Cyrus daemon and then to signal via some IPC mechanism (could be as simple as creating a file in a special folder) to some other script (that is running with enough privileges for FS modifications but has no interaction with other components) that it's OK to perform the requested operation (e.g. a snapshot creation).
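To make the file-based signalling above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the spool folder, the `.req` suffix and the function names `post_request` and `serve_once` are hypothetical and not part of Cyrus.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, List


def post_request(spool: Path, name: str) -> Path:
    """Unprivileged side: drop a request file into the spool folder."""
    # Create the file under a temporary name and rename it afterwards, so
    # the privileged watcher never sees a half-created request.
    fd, tmp = tempfile.mkstemp(dir=str(spool), prefix=".tmp-")
    os.close(fd)
    final = spool / f"{name}.req"
    os.replace(tmp, final)
    return final


def serve_once(spool: Path, action: Callable[[str], None]) -> List[str]:
    """Privileged side: run `action` for each pending request, then remove it."""
    handled = []
    for req in sorted(spool.glob("*.req")):
        action(req.stem)  # e.g. create the filesystem snapshot
        req.unlink()      # removal doubles as the "done" acknowledgement
        handled.append(req.stem)
    return handled
```

In this arrangement the backup script would call `post_request(spool, "snapshot")` while holding the lock, wait for the request file to disappear, and only then release the lock. The privileged script would call `serve_once` in a loop (or from a directory watcher) and would need write access to nothing but the spool folder and the snapshot tooling.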

Moreover, once chroot is implemented, the cyrus_sync2disk executable (like all other processes) could run inside it and have no way of seeing the entire filesystem. So if it had some security issues, they wouldn't affect the entire system, just the Cyrus daemon. For all this, the lock/unlock interface would be needed.

Anyway, these are just interface implementation details and could be easily adapted to the needs of the community once the most complex part, the global lock mechanism, is implemented. I'll update the #1763 issue with this comment and we could continue the discussion there.

Do you have an ETA for this? I can't speak for the entire community, but at least for me this is the most awaited feature.


With respect to "*Small sysadmin tasks*", I'll add the specifics in the #1764 issue.


As to *Push*, I didn't know a perl pusher layer was published; could you please indicate where it is? What I believe would be enough here is to have everything up to the point where a web request to APS would be needed (and IMO we could start with just a single server and, when everything is working, adapt it for replication).

Could you please explain the different mechanisms and components involved in the Apple push mechanism (e.g. the daemon interaction with the clients (the XAPPLEPUSHSERVICE extension), daemon -> notifyd communication, notifyd -> perl pusher, perl pusher -> web request to APS) and give us the current status for each of them (e.g. implemented in the repository, to be implemented, under an NDA but DIY with our assistance and contribute back, etc.)?

I know that Apple grants access on a case-by-case basis to some "privileged" implementation details (like those needed for the OpenVPN iOS client) under NDAs, but in this case I can't see what functionality outside Cyrus's own code could be under an NDA. Is it the web request functionality? Or the XAPPLEPUSHSERVICE dialog with Mail.app (but this part is already implemented in v3.0, right)? If I'm not mistaken, the push itself (i.e. the mechanism to inform the APS servers about an event once there's a token from the client and the certs from Apple) is quite well documented; it's even simpler than PushKit for VoIP apps. Could you please shed some light on this topic (maybe we should create an issue on GitHub to follow this discussion and to track the progress)?


With respect to *Security*, sure, it's an enormous effort to considerably improve it in any old project in a single move. A gradual improvement would be a better approach. I would suggest starting some initiatives like the document-the-architecture and document-the-code sub-projects and contributing to them whenever time and other resources permit, so they would be completed like a puzzle, piece by piece. And once some part is documented, any change to its code would first be implemented as a change in the corresponding documentation.

There are multiple benefits that greatly outweigh the expected overhead: first, a developer would be able to /quickly/ become aware of the implementation details of the part to be modified (with a side effect of better understanding the implications of the intended change, i.e. less chance of breaking something, and the change itself would be more aligned with the overall architecture). Then, while writing the documentation for the intended change, the developer could realize that there are better ways of achieving the same objective. And once the change is implemented, the developer would be able to complement the documentation for the undocumented parts with the insights he/she just gained. A sort of circular feedback.

Another benefit (depending on the internal organization of the dev team) is that more experienced developers could write the documentation for the changes (that would hold some relation with the formal specification), junior members would be those implementing the changes according to the documentation and, once ready, the senior developers would perform code reviews of the modifications - so even inexperienced developers and newcomers in your internal team could actively participate in the project. And of course the community would contribute more: right now, without enough understanding of the internals and the overall architecture, it's a significant effort for an occasional contributor to implement any change at all.

Another initiative could be to formally define the security best practices and guidelines for the project and to ask everyone to try to follow them whenever possible. If you don't have anything similar yet, I'll see if I can contribute a draft.

And a security audit, IMO, should be a community-sponsored initiative, as probably no one has enough resources to sponsor it alone. But there should be someone starting the initiative ;)


As to the *chroot* implementation, my idea is to document in detail the process initialization part (that itself could serve as a base for the document-the-architecture/code sub-projects) so everyone who knows it well could inspect the documentation and make corrections. Once we all agree on the current implementation details, I'd describe the proposed changes and others (Greg) would be able to contribute their changes too. Again, once we all agree on them, everyone involved would provide corresponding patches. Then we'd repeat the above steps for the actual chroot changes.


Happy New Year!

Regards,
Anatoli

*From:* Bron Gondwana Via Cyrus-devel
*Sent:* Tuesday, December 27, 2016 21:04
*To:* Cyrus-devel
*Subject:* Re: Release plan blog post

Hi,

Sorry for the delay in responding to this - I left it over Christmas so I could sit down without distraction and reply when I was back in the office.

On Sat, 24 Dec 2016, at 17:09, Anatoli via Cyrus-devel wrote:
Hi Bron, all.

Thanks for the update and for the support of the project. That's great we'll see the 3.0 release soon!

Replying to your last paragraph in the blog post about the community needs, I believe that what's good for FM is mostly good for the community too. The FM team is probably the largest operator of the project and has a better view / faces issues and special needs more frequently than anyone else, so your vision should suit other project users well too.

A few areas where I see the FM needs probably don't exactly match the needs of the community are the following 3.

*1. Small (SMB) deployments* with a single server and somewhat limited physical resources (e.g. disk space).

An example here is the excellent backup mechanism Ellie implemented, which suits the needs of medium to large deployments well, but IMO that's not the best approach for small deployments, as it requires a separate server or, if run on the same server just for the safe data-to-disk synchronization, twice the disk space.

A better approach for small deployments, as I see it (and I believe it's highly demanded by the community), would be to have an executable that would instruct the Cyrus daemon to synchronize all the internal structures to disk and lock (stop writing to disk) for a defined period. The lock could be implemented by hanging on network write requests, by writing them to temporary files, or by accumulating the changes in memory (the latter approach has a potential for data loss).

Once the flush is performed and the lock is applied, a (custom) backup script could create a snapshot of the partition that would hold the Cyrus data in a safe-to-backup state. Immediately after creating the snapshot, the lock would be released and the daemon would continue its normal operation. Then the backup script would be able to safely backup the data, e.g. create an incremental backup and upload it to some external storage, then destroy the snapshot.

Usage example: cyrus_sync2disk --lock=5 -> returns 0 when the data is synced and a lock for 5 seconds is obtained. cyrus_sync2disk --unlock -> returns 0 if the lock has been released and 1 if there was no active lock (e.g. a previous lock has expired), so the backup script knows whether it performed the required operations with the lock still in place or whether it should perform the lock-snapshot-unlock operation again. The short timeout is to protect the daemon from an infinite lock if a backup script fails to unlock it.
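The lock-snapshot-unlock cycle described above could be driven by a backup script along these lines. This is only a sketch: cyrus_sync2disk is the executable proposed in this message and does not exist yet, and `snapshot_under_lock`, `take_snapshot` and `drop_snapshot` are hypothetical names. The command runner is injectable so the logic can be exercised without the real tool.

```python
import subprocess
from typing import Callable, Sequence


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.call(list(cmd))


def snapshot_under_lock(
    take_snapshot: Callable[[], None],
    drop_snapshot: Callable[[], None],
    runner: Callable[[Sequence[str]], int] = run_command,
    lock_seconds: int = 5,
    max_attempts: int = 3,
) -> bool:
    """Return True once a snapshot was taken while the lock was still held."""
    for _ in range(max_attempts):
        # Proposed interface: exit status 0 means data synced and lock held.
        if runner(["cyrus_sync2disk", f"--lock={lock_seconds}"]) != 0:
            continue
        take_snapshot()
        # Exit status 0 means the lock was still active when we released it,
        # so the snapshot was taken in a safe-to-backup state.
        if runner(["cyrus_sync2disk", "--unlock"]) == 0:
            return True
        # The lock expired before the unlock: the snapshot may be
        # inconsistent, so discard it and try again.
        drop_snapshot()
    return False
```

A script would pass its own snapshot commands (LVM, ZFS, btrfs, etc.) as the two callbacks and start the actual backup only when the function returns True.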


I saw the response to this which suggested a "run a command under exclusive lock" which is definitely a better approach to this. I understand what you want here, and I mostly like the idea.

The one thing that gives me pause is that it requires a single lock against ALL cyrus processes. Right now, there's no global lock that processes take while making changes, and we'd need to add one. I would want to make it be something that needs to be turned on in config so that sites which DON'T need it don't have to pay the extra locking cost.

But the design is definitely viable. I want to do some other things with locking as well, like a single global lock for moves between users, renames, etc - so that we don't have lock ordering issues with those things.

https://github.com/cyrusimap/cyrus-imapd/issues/1763


*2. Small sysadmin tasks* for typical configurations that now require manual actions or writing one's own scripts. An example: new mailbox creation with particular flags (\Sent, \Junk, \Trash) set for special-use folders (that could be implemented as an extended functionality of the autocreate_inbox_folders option).

At FM you have everything automated for sure with your own custom scripts, but sysadmins with little experience with Cyrus or those that don't write scripts with ease would find some tasks difficult to accomplish; for others that's just an overhead/additional points of failure that could be avoided with small built-in automations.


This is definitely an interesting area for enhancement. The basic tool here is cyradm, and I think what we're really looking for is extending cyradm.

https://github.com/cyrusimap/cyrus-imapd/issues/1764

I'd love some more specific details here, including test plans ideally so that we can build and test these features. Or pull requests that do that :)


*3. New deployments* (vs ongoing upgrades/maintenance). How easy and straightforward it is to set up a new deployment (possibly migrating from other email servers). Here I'm referring to the initial configuration, tools and documentation.


Yeah, we know about this one. I'm not going to create a specific bug for it, because it's kind of spread out over lots of different things. Nicola is working on improving our documentation, but again the best people to give advice are people who've recently done it. I haven't really "installed Cyrus from scratch" for 12 years, certainly not without the FastMail configuration and build systems. Except for the test environment, which has its own special magics.


*Push* is an area that is well implemented at FM, but there's no considerable advance in the Cyrus repository, and I believe the community needs in this area are mostly the same as the FM's.

The 3.0 release includes Apple push notifications support (XAPPLEPUSHSERVICE) and that's a good start. I haven't tried it yet and I understand that some effort would be required to make it work (the part that talks to the APS is not included and should be implemented independently). I do wonder why FM wouldn't share the notifier code & some documentation about how to make everything work? The only thing that'd be different in each deployment are the certificates. And it would be really exciting to have working Apple push in Cyrus just after some typical setup steps.

If there are some impediments for the FM team to share their implementation details on mail and caldav/carddav push notifications, I'll try to make this feature work in my deployments in the near future and contribute to the project a detailed howto and the APS notifier code (but your assistance would be great).

Unfortunately part of that is under NDA, so we can't offer much more support there. When/if Apple open up their push infrastructure more, we'll definitely release the other parts of it.

I'm sure we've published at least part of our perl pusher layer before, though some of the session magic uses our sql infrastructure rather than storing sessions in Cyrus so that it survives failover between replicas. If we wanted to store them in Cyrus we'd need to have a replication protocol for key-value stores or some sort of replicated DB store.


And a general area that would benefit everyone, but that wasn't specifically mentioned in the blog post, is *Security*.

I don't mean Cyrus is insecure, and I do know that the FM team pays special attention to security of their infrastructure as a whole. Rather I would like to suggest that a special emphasis could be placed on Cyrus security from a development POV, e.g. to document in detail (and keep updated) the entire project's code base and its architecture, to follow most of the security development best-practices, to re-implement with security in mind some old/hacky parts of the system (they would become apparent during the documentation phase), to apply general hardening tactics (like chroot) or even to re-engineer the overall architecture for security, to perform internal security code reviews on a regular basis.

This is the kind of well-meaning plan that leads down a massive rabbit hole. "Document in detail (and keep updated)". Such few words for so much work. We do bits and pieces of this as we can, and I've recently set up coverity to assess the project, and am working my way through its reports.

Certainly some parts of the code (like sieve) are a fricking mess, and could very well be hiding security issues because they're just so horrible. We fix them up as we have time and deal with them.


FM already had a security audit in 2014 (according to your previous blog posts), but you don't specify any details of how deep it was and what aspects it covered. Maybe an independent in-depth security audit with public results just for the Cyrus code base could be sponsored in collaboration with the community?

Again, unfortunately NDAs :(

Feel free to sponsor a security audit. I'd be happy to participate, but I can't justify funding it. I have an idea of where likely bugs are (URLAUTH, FETCH BODY[part] until recently when we rewrote it, maybe even message structure parsing) and I rewrite them to be safer when I deal with those bits of code, as do the rest of the team.

As for me, as a member of the community, I intend to implement the chroot functionality for the daemon (late chroot like in OpenVPN). I've already discussed it briefly with Ellie and was hoping to make it ready for the 3.0 release, but had no time for it yet. To implement it correctly, some important changes should first be applied to the initialization logic (the moment of dropping the privs: it should be inside newly started processes, rather than in the master). This change should be carefully analyzed and it's a significant effort; I hope to be able to contribute it during Q1/17. Once this change is implemented (which in itself would change almost no functionality, so it would be easy to test and deploy), the chroot functionality would be some 15 lines of code.
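For reference, the generic POSIX order of operations for a late chroot followed by a privilege drop can be sketched as below. This is not the Cyrus code (the master and its services are written in C) and not the planned patch, just an illustration of why the drop has to happen inside the newly started process. The function name `enter_jail` and the injectable `osmod` parameter are assumptions made for the example.

```python
import os


def enter_jail(root: str, uid: int, gid: int, osmod=os) -> None:
    """Confine the calling process to `root`, then drop root privileges."""
    osmod.chroot(root)    # requires root, so it must happen before the drop
    osmod.chdir("/")      # leave no working directory outside the jail
    osmod.setgroups([])   # clear supplementary groups while still root
    osmod.setgid(gid)     # group first: after setuid it can no longer be changed
    osmod.setuid(uid)     # irreversible, the process cannot regain root
```

The call would be made once per service process, after it has opened everything it needs from outside the jail (config files, sockets, logs) and before it starts handling client input.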

Interesting.  I'm looking forward to seeing it.

One thing that I would add here, is that we need to extract the SNMP code from master and run it in a separate process as well if we have any hope of making master something that can be allowed to run with any higher privileges than it currently does in its mainloop. Greg explained to me what he had planned for that, but never had time to do it.

https://github.com/cyrusimap/cyrus-imapd/issues/1765


Merry Christmas and Happy New Year!


Thanks, the same to you!

Regards,

Bron.



Anatoli

*From:* Bron Gondwana Via Cyrus-devel
*Sent:* Thursday, December 22, 2016 03:15
*To:* Cyrus Devel, Info Cyrus
*Subject:* Release plan blog post
I posted on the FastMail advent about our plans for releasing Cyrus 3.0 - it's 
a bit roundabout doing it this way rather than here first, but hey - we talked 
about it on Monday night's regular meeting.

Here's the blog post:

https://blog.fastmail.com/2016/12/22/cyrus-development-and-release-plans/

tl;dr, Ellie recently released 3.0beta6.  We're going to do a release candidate 
on Jan 13th and then release for real soon afterwards, so get testing!

There are no major changes expected before release.  I'll be doing a couple of 
small JMAP changes to align with the latest spec and possibly to add 
getMessageListUpdates if I can manage it in time.

Other than that, I'm looking at reverse UniqueId indexing similar to the RACL 
support - it's already in testing and might get added behind a default-off 
config switch.

We'll be assessing all the defaults.  I'm really tempted to turn RACL on, but 
it needs group support if your site uses groups, and that's not done yet, so 
I'd need someone willing to test it!

Bron.





--
  Bron Gondwana
  br...@fastmail.fm


