Re: Release plan blog post

Anatoli via Cyrus-devel Sat, 31 Dec 2016 07:48:24 -0800

Hi Bron, all.

The suggestion proposed by Vladislav for the *Backup* mechanism would simplify the operation and reduce the lock time even more, so I also like the idea. Though, I see 2 possible issues there. First, each admin should ensure the command he/she passes as a param has absolutely no way to block, under any circumstances. Probably not a big deal for normal conditions, but corner cases should be analyzed.

Then, one may have highly granular privileges for every task, so the process that has access to the Cyrus daemon may not have filesystem modification access. IMO, there is no way (even in theory) to guarantee bug-free development with the current tools and practices, so the only feasible approach to security is compartmentalization and defense-in-depth principle <https://en.wikipedia.org/wiki/Defense_in_depth_%28computing%29>.

One possible arrangement here would be for a backup script to request a lock from the Cyrus daemon and then to signal via some IPC mechanism (could be as simple as creating a file in a special folder) to some other script (that is running with enough privileges for FS modifications but has no interaction with other components) that it's OK to perform the requested operation (e.g. a snapshot creation).
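
A minimal sketch of the flag-file handshake described above, in Python. Everything in it is hypothetical: the spool directory, the file name `snapshot.request` and the `do_snapshot` callable are illustration-only names, and none of it is part of Cyrus. The unprivileged backup script drops a request file once it holds the lock; the privileged helper, which never talks to the Cyrus daemon, sees the file, performs the operation and removes the file to signal completion.

```python
import os
from pathlib import Path
from typing import Callable

REQUEST_NAME = "snapshot.request"  # hypothetical file name, not a Cyrus convention


def request_snapshot(spool: Path) -> Path:
    """Unprivileged side: publish a request atomically (write, then rename)."""
    tmp = spool / (REQUEST_NAME + ".tmp")
    tmp.write_text("snapshot\n")
    final = spool / REQUEST_NAME
    os.replace(tmp, final)  # atomic rename, so the helper never sees a partial file
    return final


def serve_pending_request(spool: Path, do_snapshot: Callable[[], None]) -> bool:
    """Privileged side: serve one pending request, if there is one.

    Returns True if a request file was found. The file is removed afterwards
    (even if do_snapshot raises), so a real helper would also need a way to
    report failures back to the requester.
    """
    request = spool / REQUEST_NAME
    if not request.exists():
        return False
    try:
        do_snapshot()
    finally:
        request.unlink()
    return True


def is_done(spool: Path) -> bool:
    """Unprivileged side: the request is finished once its file is gone."""
    return not (spool / REQUEST_NAME).exists()
```

In this arrangement the backup script would keep the lock until `is_done` returns true (or the lock times out) and only then release it.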

Even more, once chroot is implemented, cyrus_sync2disk executable (as all other processes) could run inside it and have no way of overseeing the entire filesystem. So if it has some security issues, it wouldn't affect the entire system, just the Cyrus daemon. For all this, the lock/unlock interface would be needed.

Anyway, these are just interface implementation details and could be easily adapted to the needs of the community once the most complex part, the global lock mechanism, is implemented. I'll update the #1763 issue with this comment and we could continue the discussion there.

Do you have an ETA for this? I can't speak for the entire community, but at least for me this is the most awaited feature.

With respect to "*Small sysadmin tasks*", I'll add the specifics in the #1764 issue.

As to *Push*, I didn't know a perl pusher layer was published; could you please indicate where it is? What I believe would be enough here is to have everything up to the point where a web request to APS would be needed (and IMO we could start with just a single server and, when everything is working, adapt it for replication).

Could you please explain the different mechanisms and components involved in the Apple push mechanism (e.g. the daemon interaction with the clients (the XAPPLEPUSHSERVICE extension), daemon -> notifyd communication, notifyd -> perl pusher, perl pusher -> web request to APS) and give us the current status for each of them (e.g. implemented in the repository, to be implemented, under an NDA but DIY with our assistance and contribute back, etc.)?

I know that Apple grants access on a case-by-case basis to some "privileged" implementation details (like those needed for the OpenVPN iOS client) under NDAs, but in this case I can't see what functionality outside Cyrus's own code could be under an NDA. Is it the web request functionality? Or the XAPPLEPUSHSERVICE dialog with the mail.app (but this part is already implemented in v3.0, right)? If I'm not wrong, the push itself (i.e. the mechanism to inform the APS servers about an event once there's a token from the client and the certs from Apple) is quite well documented; it's even simpler than PushKit for VoIP apps. Could you please shed some light on this topic (maybe we should create an issue on GitHub to follow this discussion and to track progress)?

With respect to *Security*, sure, it's an enormous effort to considerably improve it in any old project in a single move. A gradual improvement would be a better approach. I would suggest starting some initiatives like the document-the-architecture and document-the-code sub-projects and contributing to them whenever time and other resources permit, so they would be completed like a puzzle, piece by piece. And once some part is documented, any change to its code would first be implemented as a change in the corresponding documentation.

There are multiple benefits that greatly outweigh the expected overhead: first, a developer would be able to /quickly/ become aware of the implementation details of the part to be modified (with the side effect of better understanding the implications of the intended change, i.e. less chance of breaking something, and the change itself would be more aligned with the overall architecture). Then, while writing the documentation for the intended change, the developer could realize that there are better ways of achieving the same objective. And once the change is implemented, the developer would be able to complement the documentation for the undocumented parts with the insights he/she just gained. Some sort of a circular feedback.

Another benefit (depending on the internal organization of the dev team) is that more experienced developers could write the documentation for the changes (that would hold some relation with the formal specification), junior members would be those implementing the changes according to the documentation and, once ready, the senior developers would perform code reviews of the modifications - so even inexperienced developers and newcomers in your internal team could actively participate in the project. And of course the community would contribute more, as now, without enough understanding of the internals and the overall architecture, it's a significant effort for an occasional contributor to implement any change at all.

Another initiative could be to formally define the security best practices and guidelines for the project and to ask everyone to try to follow them whenever possible. If you don't have anything similar yet, I'll see if I can contribute a draft.

And a security audit, IMO, should be a community-sponsored initiative, as probably no one has enough resources to sponsor it alone. But there should be someone starting the initiative ;)

As to the *chroot* implementation, my idea is to document in detail the process initialization part (that itself could serve as a base for the document-the-architecture/code sub-projects) so everyone who knows it well could inspect the documentation and make corrections. Once we all agree on the current implementation details, I'd describe the proposed changes and others (Greg) would be able to contribute their changes too. Again, once we all agree on them, everyone involved would provide corresponding patches. Then we'd repeat the above steps for the actual chroot changes.



Happy New Year!

Regards,
Anatoli

*From:* Bron Gondwana Via Cyrus-devel
*Sent:* Tuesday, December 27, 2016 21:04
*To:* Cyrus-devel
*Subject:* Re: Release plan blog post

Hi,

Sorry for the delay in responding to this - I left it over Christmas so I could sit down without distraction and reply when I was back in the office.


On Sat, 24 Dec 2016, at 17:09, Anatoli via Cyrus-devel wrote:

Hi Bron, all.
Thanks for the update and for the support of the project. It's great that we'll see the 3.0 release soon!
Replying to your last paragraph in the blog post about the community needs, I believe that what's good for FM is mostly good for the community too. The FM team is probably the largest operator of the project and has a better view of, and faces, issues and special needs more frequently than anyone else, so your vision should suit other project users well too.
A few areas where I see the FM needs probably don't exactly match the needs of the community are the following 3.
*1. Small (SMB) deployments* with a single server and somewhat limited physical resources (e.g. disk space).
Here as an example comes the excellent backup mechanism Ellie implemented that suits well the needs of medium to large deployments, but IMO that's not the best approach for small deployments, as it requires a separate server or, if run on the same server just for the safe data-to-disk synchronization, twice the disk space.
A better approach for small deployments, as I see it (and I believe it's highly demanded by the community), would be to have an executable that would instruct the Cyrus daemon to synchronize to disk all the internal structures and lock (stop writing to disk) for a defined period. The lock could be implemented by hanging on network write requests or by writing them to temporary files, or by accumulating the changes in memory (the latter approach has a potential for data loss).
Once the flush is performed and the lock is applied, a (custom) backup script could create a snapshot of the partition that would hold the Cyrus data in a safe-to-backup state. Immediately after creating the snapshot, the lock would be released and the daemon would continue its normal operation. Then the backup script would be able to safely back up the data, e.g. create an incremental backup and upload it to some external storage, then destroy the snapshot.
Usage example: cyrus_sync2disk --lock=5 -> returns 0 when the data is synced and a lock for 5 seconds is obtained. cyrus_sync2disk --unlock -> returns 0 if the lock has been released and 1 if there was no active lock (e.g. a previous lock has expired), so the backup script knows if it performed the required operations with the lock still in place or if it should perform the lock-snapshot-unlock operation again. The short timeout is to protect the daemon from an infinite lock if a backup script fails to unlock it.
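
The lock-snapshot-unlock cycle above could be driven by a loop like the following Python sketch. Note that `cyrus_sync2disk` is only the tool proposed in this message and does not exist; the four callables and `max_attempts` are hypothetical stand-ins, used so the retry logic can be shown on its own.

```python
from typing import Callable


def snapshot_under_lock(
    lock: Callable[[], int],
    snapshot: Callable[[], None],
    unlock: Callable[[], int],
    discard_snapshot: Callable[[], None],
    max_attempts: int = 3,
) -> bool:
    """Take a snapshot that is known to have been created under the lock.

    lock() and unlock() follow the exit codes proposed above: lock returns 0
    once data is synced and the lock is held; unlock returns 0 if the lock
    was still active and 1 if it had already expired.
    """
    for _ in range(max_attempts):
        if lock() != 0:
            continue  # could not obtain the lock, try again
        snapshot()
        if unlock() == 0:
            return True  # lock was still held, so the snapshot is consistent
        # The lock expired before we released it: the snapshot may have been
        # taken while the daemon was writing, so throw it away and retry.
        discard_snapshot()
    return False
```

If the proposed CLI existed, `lock` might be something like `lambda: subprocess.call(["cyrus_sync2disk", "--lock=5"])`, with `snapshot` wrapping whatever LVM, ZFS or btrfs command the site uses.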

I saw the response to this which suggested a "run a command under exclusive lock" which is definitely a better approach to this. I understand what you want here, and I mostly like the idea.

The one thing that gives me pause is that it requires a single lock against ALL cyrus processes. Right now, there's no global lock that processes take while making changes, and we'd need to add one. I would want to make it be something that needs to be turned on in config so that sites which DON'T need it don't have to pay the extra locking cost.

But the design is definitely viable. I want to do some other things with locking as well, like a single global lock for moves between users, renames, etc - so that we don't have lock ordering issues with those things.


https://github.com/cyrusimap/cyrus-imapd/issues/1763

*2. Small sysadmin tasks* for typical configurations that now require manual actions or writing one's own scripts. An example: new mailbox creation with particular flags (\Sent, \Junk, \Trash) set for special-use folders (that could be implemented as an extended functionality of the autocreate_inbox_folders option).
At FM you have everything automated for sure with your own custom scripts, but sysadmins with little experience with Cyrus or those that don't write scripts with ease would find some tasks difficult to accomplish; for others that's just an overhead/additional points of failure that could be avoided with small built-in automations.

This is definitely an interesting area for enhancement. The basic tool here is cyradm, and I think what we're really looking for is extending cyradm.


https://github.com/cyrusimap/cyrus-imapd/issues/1764

I'd love some more specific details here, including test plans ideally so that we can build and test these features. Or pull requests that do that :)

*3. New deployments* (vs ongoing upgrades/maintenance). How easy and straightforward it is to set up a new deployment (possibly migrating from other email servers). Here I'm referring to the initial configuration, tools and documentation.

Yeah, we know about this one. I'm not going to create a specific bug for it, because it's kind of spread out over lots of different things. Nicola is working on improving our documentation, but again the best people to give advice are people who've recently done it. I haven't really "installed Cyrus from scratch" for 12 years, certainly not without the FastMail configuration and build systems. Except for the test environment, which has its own special magics.

*Push* is an area that is well implemented at FM, but there's no considerable advance in the Cyrus repository, and I believe the community needs in this area are mostly the same as FM's.
The 3.0 release includes Apple push notifications support (XAPPLEPUSHSERVICE) and that's a good start. I haven't tried it yet and I understand that some effort would be required to make it work (the part that talks to the APS is not included and should be implemented independently). I do wonder why FM wouldn't share the notifier code & some documentation about how to make everything work? The only thing that'd be different in each deployment are the certificates. And it would be really exciting to have working Apple push in Cyrus just after some typical setup steps.
If there are some impediments for the FM team to share their implementation details on mail and caldav/carddav push notifications, I'll try to make this feature work in my deployments in the near future and contribute to the project a detailed howto and the APS notifier code (but your assistance would be great).

Unfortunately part of that is under NDA, so we can't offer much more support there. When/if Apple open up their push infrastructure more, we'll definitely release the other parts of it.

I'm sure we've published at least part of our perl pusher layer before, though some of the session magic uses our sql infrastructure rather than storing sessions in Cyrus so that it survives failover between replicas. If we wanted to store them in Cyrus we'd need to have a replication protocol for key-value stores or some sort of replicated DB store.

And a general area that would benefit everyone, but that wasn't specifically mentioned in the blog post, is *Security*.
I don't mean Cyrus is insecure, and I do know that the FM team pays special attention to security of their infrastructure as a whole. Rather I would like to suggest that a special emphasis could be placed on Cyrus security from a development POV, e.g. to document in detail (and keep updated) the entire project's code base and its architecture, to follow most of the security development best-practices, to re-implement with security in mind some old/hacky parts of the system (they would become apparent during the documentation phase), to apply general hardening tactics (like chroot) or even to re-engineer the overall architecture for security, to perform internal security code reviews on a regular basis.

This is the kind of well meaning plan that leads down a massive rabbit hole. "Document in detail (and keep updated)". Such few words for so much work. We do bits and pieces of this as we can, and I've recently set up coverity to assess the project, and am working my way through its reports.

Certainly some parts of the code (like sieve) are a fricking mess, and could very well be hiding security issues because they're just so horrible. We fix them up as we have time and deal with them.

FM already had a security audit in 2014 (according to your previous blog posts), but you don't specify any details of how deep it was and what aspects it covered. Maybe an independent in-depth security audit with public results just for the Cyrus code base could be sponsored in collaboration with the community?


Again, unfortunately NDAs :(

Feel free to sponsor a security audit. I'd be happy to participate, but I can't justify funding it. I have an idea of where likely bugs are (URLAUTH, FETCH BODY[part] until recently when we rewrote it, maybe even message structure parsing) and I rewrite them to be safer when I deal with those bits of code, as do the rest of the team.

As for me as a member of the community, I have an intention to implement the chroot functionality for the daemon (late chroot like in OpenVPN). I've already discussed it briefly with Ellie and was hoping to make it ready for the 3.0 release, but had no time for it yet. To implement it correctly, first some important changes should be applied to the initialization logic (the moment of dropping the privs, it should be inside newly started processes, rather than in the master). This change should be carefully analyzed and it's a significant effort; I hope to be able to contribute it during Q1/17. Once this change is implemented (which in itself would change almost no functionality, so it would be easy to test and deploy), the chroot functionality would be some 15 lines of code.


Interesting.  I'm looking forward to seeing it.

One thing that I would add here, is that we need to extract the SNMP code from master and run it in a separate process as well if we have any hope of making master something that can be allowed to run with any higher privileges than it currently does in its mainloop. Greg explained to me what he had planned for that, but never had time to do it.


https://github.com/cyrusimap/cyrus-imapd/issues/1765


Merry Christmas and Happy New Year!


Thanks, the same to you!

Regards,

Bron.



Anatoli

*From:* Bron Gondwana Via Cyrus-devel
*Sent:* Thursday, December 22, 2016 03:15
*To:* Cyrus Devel, Info Cyrus
*Subject:* Release plan blog post

I posted on the FastMail advent about our plans for releasing Cyrus 3.0 - it's 
a bit roundabout doing it this way rather than here first, but hey - we talked 
about it on Monday night's regular meeting.

Here's the blog post:

https://blog.fastmail.com/2016/12/22/cyrus-development-and-release-plans/

tl;dr, Ellie recently released 3.0beta6.  We're going to do a release candidate 
on Jan 13th and then release for real soon afterwards, so get testing!

There are no major changes expected before release.  I'll be doing a couple of 
small JMAP changes to align with the latest spec and possibly to add 
getMessageListUpdates if I can manage it in time.

Other than that, I'm looking at a reverse UniqueId indexing similar to the RACL
support - it's already in testing and might get added behind a default-off 
config switch.

We'll be assessing all the defaults.  I'm really tempted to turn RACL on, but 
it needs group support if your site uses groups, and that's not done yet, so 
I'd need someone willing to test it!

Bron.


--
  Bron Gondwana
  [email protected]
