Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On 9/9/14, 12:03 PM, Monty Taylor wrote:
On 09/04/2014 01:30 AM, Clint Byrum wrote:
Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:

Greetings,

Last Tuesday the TC held the first graduation review for Zaqar. During the meeting some concerns arose. I've listed those concerns below with some comments, hoping that it will help start a discussion before the next meeting. In addition, I've added some comments about the project's stability at the bottom and an etherpad link pointing to a list of use cases for Zaqar.

Hi Flavio. This was an interesting read. As somebody whose attention has recently been drawn to Zaqar, I am quite interested in seeing it graduate.

# Concerns

- Concern on operational burden of requiring NoSQL deploy expertise to the mix of openstack operational skills

For those of you not familiar with Zaqar, it currently supports two NoSQL drivers - MongoDB and Redis - and those are the only two drivers it supports for now. This will require operators willing to use Zaqar to maintain a new (?) NoSQL technology in their system. Before expressing our thoughts on this matter, let me say that:

1. By removing the SQLAlchemy driver, we basically removed the chance for operators to use an already-deployed OpenStack technology
2. Zaqar won't be backed by any AMQP-based messaging technology for now. Here's[0] a summary of the research the team (mostly done by Victoria) did during Juno
3. We (OpenStack) used to require Redis for the zmq matchmaker
4. We (OpenStack) also use memcached for caching and, as the oslo caching lib becomes available - or a wrapper on top of dogpile.cache - Redis may be used in place of memcached in more and more deployments
5. Ceilometer's recommended storage driver is still MongoDB, although Ceilometer now has support for sqlalchemy. (Please correct me if I'm wrong.)

That being said, it's obvious we already, to some extent, promote some NoSQL technologies. However, for the sake of the discussion, let's assume we don't.
I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't keep avoiding these technologies. NoSQL technologies have been around for years and we should be prepared - including OpenStack operators - to support them. Not every tool is good for all tasks - one of the reasons we removed the sqlalchemy driver in the first place - therefore it's impossible to keep a homogeneous environment for all services.

I wholeheartedly agree that non-traditional storage technologies that are becoming mainstream are good candidates for use cases where SQL-based storage gets in the way. I wish there wasn't so much FUD (warranted or not) about MongoDB, but that is the reality we live in.

With this, I'm not suggesting we ignore the risks and the extra burden this adds but, instead of attempting to avoid it completely by not evolving the stack of services we provide, we should probably work on defining a reasonable subset of NoSQL services we are OK with supporting. This will help make the burden smaller and give operators the option to choose.

[0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/

- Concern on should we really reinvent a queue system rather than piggyback on one

As mentioned in the meeting on Tuesday, Zaqar is not reinventing message brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack flavor on top. [0]

I think Zaqar is more like SMTP and IMAP than AMQP. You're not really trying to connect two processes in real time. You're trying to do fully asynchronous messaging with fully randomized access to any message. Perhaps somebody should explore whether the approaches taken by large-scale IMAP providers could be applied to Zaqar. Anyway, I can't imagine writing a system to intentionally use the semantics of IMAP and SMTP. I'd be very interested in seeing actual use cases for it; apologies if those have been posted before.
It seems like you're EITHER describing something called XMPP, which has at least one open-source scalable backend called ejabberd, OR you've actually hit the nail on the head with bringing up SMTP and IMAP, but for some reason that feels strange.

SMTP and IMAP already implement every feature you've described, as well as retries/failover/HA and a fully end-to-end secure transport (if installed properly). If you don't actually set them up to run as a public messaging interface but just as a cloud-local exchange, then you could get by with very low overhead for massive throughput - it can very easily be run on a single machine for Sean's simplicity, and could just as easily be scaled out using well-known techniques for public-cloud-sized deployments.

So why not use existing daemons that do this? You could still use the REST API you've got, but instead of writing it to a mongo backend and trying to implement all of the things that already exist in SMTP/IMAP - you could just have them front to it. You could even bypass normal
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On 9/9/14, 4:47 PM, Devananda van der Veen wrote:
On Tue, Sep 9, 2014 at 4:12 PM, Samuel Merritt s...@swiftstack.com wrote:
On 9/9/14, 12:03 PM, Monty Taylor wrote: [snip]

So which is it? Because it sounds to me like it's a thing that actually does NOT need to diverge in technology in any way, but I've been told that it needs to diverge because it's delivering a different set of features - and I'm pretty sure if it _is_ the thing that needs to diverge in technology because of its feature set, then it's a thing I don't think we should be implementing in python in OpenStack, because it already exists and it's called AMQP.

Whether Zaqar is more like AMQP or more like email is a really strange metric to use for considering its inclusion.

I don't find this strange at all -- I had been judging the technical merits of Zaqar (ex-Marconi) for the last ~18 months based on the understanding that it aimed to provide Queueing-as-a-Service, and found its delivery of that to be lacking on technical grounds. The implementation did not meet my view of what a queue service should provide; it is based on some serious antipatterns (storing a queue in an RDBMS is probably the most obvious); and in fact, it isn't even queue-like in the access patterns enabled by the REST API (random access to a set != a queue). That was the basis for a large part of my objections to the project over time, and a source of frustration for me as the developers justified many of their positions rather than accepting feedback and changing course during the incubation period. The reason for this seems clear now...

As was pointed out in the TC meeting today, Zaqar is (was?) actually aiming to provide Messaging-as-a-Service -- not queueing as a service! This is another way of saying it's more like email and less like AMQP, which means my but-it's-not-a-queue objection to the project's graduation is irrelevant, and I need to rethink all my previous assessments of the project.
The questions now before us are:

- should OpenStack include, in the integrated release, a messaging-as-a-service component?

I certainly think so. I've worked on a few reasonable-scale web applications, and they all followed the same pattern: HTTP app servers serving requests quickly, background workers for long-running tasks, and some sort of durable message-broker/queue-server thing for conveying work from the first to the second. A quick straw poll of my nearby coworkers shows that every non-trivial web application they've worked on in the last decade follows the same pattern.

While not *every* application needs such a thing, web apps are quite common these days, and Zaqar satisfies one of their big requirements. Not only that, it does so in a way that requires much less babysitting than run-your-own-broker does.

- is Zaqar a technically sound implementation of such a service?

As an aside, there are still references to Zaqar as a queue in the wiki [0], in the governance repo [1], and on launchpad [2].

Regards, Devananda

[0] Multi-tenant queues based on Keystone project IDs https://wiki.openstack.org/wiki/Zaqar#Key_features
[1] Queue service is even the official OpenStack Program name, and the mission statement starts with To produce an OpenStack message queueing API and service. http://git.openstack.org/cgit/openstack/governance/tree/reference/programs.yaml#n315
[2] Zaqar is a new OpenStack project to create a multi-tenant cloud queuing service https://launchpad.net/zaqar

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
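The app-server/worker pattern described above can be sketched minimally in-process; the standard-library queue below stands in for the durable broker/queue tier (Zaqar, in this discussion), and all names are invented for illustration:

```python
import queue
import threading

# In-process stand-ins for the two tiers; the queue takes the place of
# the durable message-broker/queue-server service.
jobs = queue.Queue()
results = []

def worker():
    # Background worker: drains long-running tasks so the HTTP tier
    # never blocks on them.
    while True:
        job = jobs.get()
        if job is None:  # shutdown sentinel
            break
        results.append(("done", job))
        jobs.task_done()

def handle_request(task):
    # "HTTP app server": enqueue the slow work and answer immediately.
    jobs.put(task)
    return "202 Accepted"

t = threading.Thread(target=worker)
t.start()
print(handle_request("resize-image-42"))  # 202 Accepted
jobs.join()      # wait for the worker to finish the enqueued task
jobs.put(None)   # stop the worker
t.join()
print(results)   # [('done', 'resize-image-42')]
```

The point of the middle tier is durability and decoupling: either side can restart without losing work, which is exactly what an in-process queue does not give you and a service like Zaqar aims to.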
Re: [openstack-dev] [nova] Bogus -1 scores from turbo hipster
On 1/7/14 2:53 PM, Michael Still wrote:

Hi. Thanks for reaching out about this. It seems this patch has now passed turbo hipster, so I am going to treat this as a more theoretical question than perhaps you intended. I should note though that Joshua Hesketh and I have been trying to read / triage every turbo hipster failure, but that has been hard this week because we're both at a conference. The problem this patch faced is that we are having trouble defining what is a reasonable amount of time for a database migration to run for. Specifically:

2014-01-07 14:59:32,012 [output] 205 - 206...
2014-01-07 14:59:32,848 [heartbeat]
2014-01-07 15:00:02,848 [heartbeat]
2014-01-07 15:00:32,849 [heartbeat]
2014-01-07 15:00:39,197 [output] done

So applying migration 206 took slightly over a minute (67 seconds). Our historical data (mean + 2 standard deviations) says that this migration should take no more than 63 seconds. So this only just failed the test.

It seems to me that requiring a runtime less than (mean + 2 stddev) leads to a false-positive rate of 1 in 40, right? If the runtimes have a normal(-ish) distribution, then 95% of them will be within 2 standard deviations of the mean, so that's 1 in 20 falling outside that range. Then discard the ones that are faster than (mean - 2 stddev), and that leaves 1 in 40. Please correct me if I'm wrong; I'm no statistician.

Such a high false-positive rate may make it too easy to ignore turbo hipster as the bot that cried wolf. This problem already exists with Jenkins and the devstack/tempest tests; when one of those fails, I don't wonder what I broke, but rather how many times I'll have to recheck the patch until the tests pass. Unfortunately, I don't have a solution to offer, but perhaps someone else will.
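The (mean + 2 stddev) cutoff and its chance failure rate can be checked numerically. The runtimes below are invented for illustration, not turbo hipster's real data; the tail probability comes from the normal CDF:

```python
import statistics
from math import erf, sqrt

# Hypothetical historical runtimes (seconds) for one migration.
runtimes = [55, 58, 60, 61, 59, 57, 62, 60, 58, 61]

mean = statistics.mean(runtimes)
stddev = statistics.pstdev(runtimes)
threshold = mean + 2 * stddev  # the pass/fail cutoff described above

# For a normal distribution, the upper tail beyond mean + 2*stddev is
# 1 - Phi(2) ~= 2.3%, i.e. roughly 1 run in 44 fails purely by chance --
# the same ballpark as the 1-in-40 estimate in the post (which counts
# the upper tail only, since fast runs aren't failures).
p_false_positive = 1 - 0.5 * (1 + erf(2 / sqrt(2)))
print(round(threshold, 1), round(p_false_positive, 4))
```

With these made-up numbers the cutoff lands near 63 seconds, so a 67-second run fails just as in the log excerpt above.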
Re: [openstack-dev] [Swift] domain-level quotas
On 1/23/14 1:46 AM, Matthieu Huin wrote:

Hello Christian,

- Original Message -
From: Christian Schwede christian.schw...@enovance.com
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org, Matthieu Huin matthieu.h...@enovance.com
Sent: Wednesday, January 22, 2014 10:47:24 PM
Subject: Re: [openstack-dev] [Swift] domain-level quotas

Hi Matthieu,

Am 22.01.14 20:02, schrieb Matthieu Huin:

The idea is to have a middleware checking a domain's current usage against a limit set in the configuration before allowing an upload. The domain id can be extracted from the token, then used to query keystone for a list of projects belonging to the domain. Swift would then compute the domain usage in a similar fashion as the way it is currently done for accounts, and proceed from there.

The problem might be computing the current usage of all accounts within a domain. It won't be a problem if you have only a few accounts in a domain, but with tens, hundreds or even thousands of accounts in a domain there will be a performance impact, because you need to iterate over all accounts (doing a HEAD on every account) and sum up the total usage.

One might object that this is already a concern when applying quotas to potentially huge accounts with lots of containers, although I agree that domains add an order of magnitude to this problem.

Swift accounts and containers keep track* of the total object count and size, so account/container quotas need only perform a single HEAD request to get the current usage. The number of containers per account or objects per container doesn't affect the speed with which the quota check runs. Since domains don't map directly to a single entity in Swift, getting the usage for a domain requires making O(N) requests to fetch the individual usage data. Domain quotas wouldn't just make usage checks an order of magnitude more costly; they'd take them from roughly constant to potentially unbounded.
* subject to eventual consistency, of course
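The constant-cost versus O(N) contrast can be sketched as follows. All names here are invented for illustration; this is not Swift's middleware API:

```python
# Fake per-account usage totals standing in for what Swift's account
# databases track (object count/size, updated asynchronously).
FAKE_USAGE = {"AUTH_a": 100, "AUTH_b": 250, "AUTH_c": 50}

def head_account(account):
    # Stand-in for one HTTP HEAD request; accounts track their own
    # totals, so a single request yields the current usage.
    return FAKE_USAGE[account]

def account_quota_ok(account, limit):
    # One HEAD, no matter how many containers or objects exist.
    return head_account(account) <= limit

def domain_quota_ok(domain_accounts, limit):
    # One HEAD *per project in the domain*: cost grows with the
    # number of projects, which is unbounded.
    return sum(head_account(a) for a in domain_accounts) <= limit

print(account_quota_ok("AUTH_a", 200))                       # True
print(domain_quota_ok(["AUTH_a", "AUTH_b", "AUTH_c"], 300))  # False
```

The request-per-upload cost is what matters: an account quota check is one HEAD on the upload path, while the domain check multiplies that by every project Keystone returns for the domain.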
Re: [openstack-dev] [swift] what does swift do if the auditor find that all 3 replicas are corrupt?
On 11/6/13 7:12 AM, Daniel Li wrote:

Hi, I have a question about swift: what does swift do if the auditor finds that all 3 replicas are corrupt? Will it notify the owner of the object (email to the account owner)? What will happen on a GET request for the corrupted object? Will it return a special error telling that all the replicas are corrupted? Or will it just say that the object does not exist? Or will it just return one of the corrupted replicas? Or something else?

If all 3 (or N) replicas are corrupt, then the auditors will eventually quarantine all of them, and subsequent GET requests will receive 404 responses.

No notifications are sent, nor is it really feasible to start sending them. The auditor is not a single process; there is one Swift auditor process running on each node in a cluster. Therefore, when an object is quarantined, there's no way for its auditor to know if the other copies are okay or not.

Note that this is highly unlikely to ever happen, at least with the default of 3 replicas. When an auditor finds a corrupt object, it quarantines it (moves it to a quarantines directory). Then, since that object is missing, the replication processes will recreate the object by copying it from a node with a good copy. You'd need to have all replicas become corrupt within a very short timespan so that the replicators don't get a chance to replace the damaged ones.
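Conceptually, the audit step is just "hash the bytes, compare to the stored checksum, quarantine on mismatch." The sketch below illustrates that idea only; it is not Swift's auditor code (the real one reads the expected ETag from object metadata, rate-limits its disk IO, and runs in a loop over the whole filesystem):

```python
import hashlib
import os
import shutil
import tempfile

def audit_object(path, expected_etag, quarantine_dir):
    # Hash the object's bytes in chunks and quarantine on mismatch.
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    if md5.hexdigest() == expected_etag:
        return "ok"
    os.makedirs(quarantine_dir, exist_ok=True)
    shutil.move(path, quarantine_dir)  # quarantine: move aside, don't repair
    return "quarantined"

# Demo: a good object passes; a bit-flipped one gets quarantined.
tmp = tempfile.mkdtemp()
obj = os.path.join(tmp, "obj.data")
with open(obj, "wb") as f:
    f.write(b"hello world")
etag = hashlib.md5(b"hello world").hexdigest()
print(audit_object(obj, etag, os.path.join(tmp, "quarantined")))  # ok
with open(obj, "wb") as f:
    f.write(b"hello w0rld")  # simulated bitrot
print(audit_object(obj, etag, os.path.join(tmp, "quarantined")))  # quarantined
print(os.path.exists(obj))  # False
```

Note the sketch never repairs anything, matching the thread: repair is the replicators' job, triggered simply by the quarantined copy being missing.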
Re: [openstack-dev] [swift] what does swift do if the auditor find that all 3 replicas are corrupt?
On 11/7/13 5:59 AM, Daniel Li wrote:

Thanks very much for your help, and please see my inline comments/questions.

On Thu, Nov 7, 2013 at 2:30 AM, Samuel Merritt s...@swiftstack.com wrote: [snip - quoted text repeated from the previous message]

When an auditor finds a corrupt object, it quarantines it (moves it to a quarantines directory).

Did you mean that when the auditor finds the corruption, it does not copy a good replica from another object server to overwrite the corrupted one; it just moves it to a quarantines directory?

That is correct. The object auditors don't perform any network IO, and in fact do not use the ring at all. All they do is scan the filesystems and quarantine bad objects in an infinite loop. (Of course, there are also container and account auditors that do similar things, but for container and account databases.)

Then, since that object is missing, the replication processes will recreate the object by copying it from a node with a good copy.
When do the replication processes recreate the object by copying it from a node with a good copy? Does the auditor send a message to replication so the replication will do the copy immediately? And what is a 'good' copy? Is the good copy's MD5 value checked before copying?

It'll happen whenever the other replicators, which are running on other nodes, get around to it. Replication in Swift is push-based, not pull-based; there is no receiver here to which a message could be sent.

Currently, a good copy is one that hasn't been quarantined. Since replication uses rsync to push files around the network, there's no checking of MD5 at copy time. However, there is work underway to develop a replication protocol that avoids rsync entirely and uses the object server throughout the entire replication process, and that would give the object server a chance to check MD5 checksums on incoming writes.

Note that this is only important should 2 replicas experience near-simultaneous bitrot; in that case, there is a chance that bad-copy A will get quarantined and replaced with bad-copy B. Eventually, though, a bad copy will get quarantined and replaced with a good copy, and then you've got 2 good copies and 1 bad one, which reduces to a previously-discussed scenario.
Re: [openstack-dev] [gate] The gate: a failure analysis
On 7/21/14, 3:38 AM, Matthew Booth wrote: [snip]

I would like to make the radical proposal that we stop gating on CI failures. We will continue to run them on every change, but only after the change has been successfully merged.

Benefits:
* Without rechecks, the gate will use 8 times fewer resources.
* Log analysis is still available to indicate the emergence of races.
* Fixes can be merged quicker.
* Vastly less developer time spent monitoring gate failures.

Costs:
* A rare class of merge bug will make it into master. Note that the benefits above will also offset the cost of resolving this rare class of merge bug.

I think this is definitely a move in the right direction, but I'd like to propose a slight modification: let's cease blocking changes on *known* CI failures. More precisely, if Elastic Recheck knows about all the failures that happened on a test run, treat that test run as successful. I think this will gain virtually all the benefits you name while still retaining most of the gate's ability to keep breaking changes out.

As a bonus, it'll encourage people to make Elastic Recheck better. Currently, the easy path is to just type recheck no bug and click submit; it takes a lot less time than scrutinizing log files to guess at what went wrong. If failures identified by E-R don't block developers' changes, then the easy path is to improve E-R's checks, which benefits everyone.
Re: [openstack-dev] [swift] - question about statsd messages and 404 errors
On 7/25/14, 4:58 AM, Seger, Mark (Cloud Services) wrote:

I’m trying to track object server GET errors using statsd and I’m not seeing them. The test I’m doing is to simply do a GET on a non-existent object. As expected, a 404 is returned and the object server log records it. However, statsd implies it succeeded because there were no errors reported. A read of the admin guide does clearly say the GET timing includes failed GETs, but my question then becomes how is one to tell there was a failure? Should there be another type of message that DOES report errors? Or how about including these in the ‘object-server.GET.errors.timing’ message?

What "error" means with respect to Swift's backend-server timing metrics is pretty fuzzy at the moment, and could probably use some work. The idea is that object-server.GET.timing has timing data for everything that Swift handled successfully, and object-server.GET.timing.errors has timing data for things where Swift failed.

Some things are pretty easy to divide up. For example, a 200-series status code always counts as success, and a 500-series status code always counts as error. It gets tricky in the 400-series status codes. For example, a 404 means that a client asked for an object that doesn't exist. That's not Swift's fault, so that goes into the success bucket (object-server.GET.timing). Similarly, a 412 means that a client set an unsatisfiable precondition in the If-Match, If-None-Match, If-Modified-Since, or If-Unmodified-Since headers, and Swift correctly determined that the requested object can't fulfill the precondition, so that one goes in the success bucket too.

However, there are other status codes that are more ambiguous. Consider 409; the object server responds with 409 if the request's X-Timestamp is less than the object's X-Timestamp (on PUT/POST/DELETE). You can get this with two near-simultaneous POSTs:

1. request A hits proxy; proxy assigns X-Timestamp: 1406316223.851131
2. request B hits proxy; proxy assigns X-Timestamp: 1406316223.851132
3. request B hits object server and gets 202
4. request A hits object server and gets 409

Does that error count as Swift's fault? If the client requests were nearly simultaneous, then I think not; there's always going to be *some* delay between accept() and gettimeofday(). On the other hand, if one proxy server's time is significantly behind another's, then it is Swift's fault.

It's even worse with 400; sometimes it's for bad paths (like asking an object server for /partition/account/container; this can happen if the administrator misconfigures their rings), and sometimes it's for bad X-Delete-At / X-Delete-After values (which are set by the client).

I'm not sure what the best way to fix this is, but if you just want to see some error metrics, unmount a disk to get some 507s.
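The success/error split described in this thread can be written down as a small classifier. This is my own illustration, not Swift's actual implementation, and the exact errors-metric name varies within the thread:

```python
# 4xx codes treated as "client's fault", hence success: 404 (not found)
# and 412 (failed precondition), per the discussion above.
SUCCESS_4XX = {404, 412}

def timing_bucket(method, status):
    # Pick the statsd timing metric for a backend request: 2xx and
    # not-our-fault 4xx count as success, 5xx as error; 400 and 409
    # remain the ambiguous cases discussed above.
    if 200 <= status < 300 or status in SUCCESS_4XX:
        return "object-server.%s.timing" % method
    if status >= 500:
        return "object-server.%s.errors.timing" % method
    return "ambiguous"

print(timing_bucket("GET", 404))  # object-server.GET.timing
print(timing_bucket("GET", 507))  # object-server.GET.errors.timing
print(timing_bucket("PUT", 409))  # ambiguous
```

Writing it out this way makes the original question concrete: a 404 lands in the plain timing bucket by design, which is why no error metric appears for a GET on a non-existent object.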
Re: [openstack-dev] [ceilometer] [swift] Improving ceilometer.objectstore.swift_middleware
On 7/30/14, 8:06 AM, Chris Dent wrote:

ceilometer/objectstore/swift_middleware.py[1] counts the size of web request and response bodies through the swift proxy server and publishes metrics of the size of the request and response and that a request happened at all. There are (at least) two bug reports associated with this bit of code:

* avoid requirement on tarball for unit tests https://bugs.launchpad.net/ceilometer/+bug/1285388
* significant performance degradation when ceilometer middleware for swift proxy uses https://bugs.launchpad.net/ceilometer/+bug/1337761

[snip]

Some options appear to be:

* Move the middleware to swift or move the functionality to swift. In the process make the functionality drop generic notifications for storage.objects.incoming.bytes and storage.objects.outgoing.bytes that anyone can consume, including ceilometer. This could potentially address both bugs.
* Move or copy swift.common.utils.{InputProxy,split_path} to somewhere in oslo, but keep the middleware in ceilometer. This would require somebody sharing the info on how to properly participate in swift's logging setup without incorporating swift. This would fix the first bug without saying anything about the second.
* Carry on importing the swift tarball or otherwise depending on swift. Fixes neither bug, maintains status quo.

What are other options? Of those above which are best or most realistic?

Swift is already emitting those numbers[1] in statsd format; could ceilometer consume those metrics and convert them to whatever notification format it uses? When configured to log to statsd, the Swift proxy will emit metrics of the form proxy-server.type.verb.status.xfer; for example, a successful object download would have a metric name of proxy-server.object.GET.200.xfer and a value of the number of bytes downloaded. Similarly, PUTs would look like proxy-server.object.PUT.2xx.xfer.
If ceilometer were to consume these metrics in a process outside the Swift proxy server, this would solve both problems. The performance fix comes from being outside the Swift proxy, and consuming statsd metrics can be done without pulling in Swift code[2].

[1] http://docs.openstack.org/developer/swift/admin_guide.html#reporting-metrics-to-statsd
[2] e.g. pystatsd
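A consumer of these metrics only needs to split the dotted metric name. A sketch of the parsing side, assuming the proxy-server.type.verb.status.xfer layout described above (the parsing code itself and its function name are my own illustration):

```python
def parse_xfer_metric(line):
    # Turn a statsd counter line like
    #   proxy-server.object.GET.200.xfer:5508|c
    # into a (verb, status, nbytes) tuple a consumer could aggregate.
    name, value = line.split(":", 1)
    parts = name.split(".")
    if parts[-1] != "xfer" or parts[1] != "object":
        return None  # not an object transfer metric
    verb, status = parts[2], parts[3]
    nbytes = int(value.split("|", 1)[0])  # counter value before "|c"
    return verb, status, nbytes

print(parse_xfer_metric("proxy-server.object.GET.200.xfer:5508|c"))
# ('GET', '200', 5508)
print(parse_xfer_metric("proxy-server.container.GET.200.xfer:120|c"))
# None
```

A real out-of-process consumer would bind a UDP socket on the statsd port and feed each datagram through a parser like this before aggregating.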
Re: [openstack-dev] [ceilometer] [swift] Improving ceilometer.objectstore.swift_middleware
On 7/31/14, 1:06 AM, Eoghan Glynn wrote:

Swift is already emitting those numbers[1] in statsd format; could ceilometer consume those metrics and convert them to whatever notification format it uses?

The problem with that approach, IIUC, is that the statsd metrics provide insufficient context. Ceilometer wants to meter usage on a per-user per-tenant basis, so it captures[1] the http-x-user-id and http-x-tenant-id headers from the incoming request for this purpose. Similarly, the resource-id is fabricated from the swift account. I don't think this supporting contextual info would be available from raw statsd metrics, or?

Good point. Adding per-user and per-tenant fields to the statsd metrics is the wrong way to go on a couple of levels. First, it would leak Keystone-isms into the core Swift code, which is at odds with Swift having pluggable auth systems. Second, it would immediately wreck anyone who's got the statsd metrics flowing into Graphite, as suddenly there'd be lots of new metrics for every single tenant/user pair, which would rapidly fill up the Graphite system's disks until it fell over.

I think your suggestion elsewhere in the thread of combining multiple API calls into a single notification is a better way to go. That'll certainly result in less client-visible slowdown from sending notifications, particularly if the notifications are sent in a separate greenthread.
Re: [openstack-dev] Cross distribution talks on Friday
On 11/1/14, 3:51 PM, Alan Pevec wrote:

%install
export OSLO_PACKAGE_VERSION=%{version}
%{__python} setup.py install -O1 --skip-build --root %{buildroot}

Then everything should be ok and PBR will become your friend.

Still not my friend, because I don't want a _build_ tool as a runtime dependency :) e.g. you don't ship make(1) to run C programs, do you? For runtime, only the pbr.version part is required, but unfortunately oslo.version was abandoned.

Swift has an elegant* solution** to this problem that makes PBR into a build-time-only dependency. Take a look at the top-level __init__.py in the Swift source tree: https://github.com/openstack/swift/blob/709187b54ff2e9b81ac53977d4283523ce16af38/swift/__init__.py

* kind of ugly
** hack
Re: [openstack-dev] LTFS integration with OpenStack Swift for scenario like - Data Archival as a Service .
On 11/13/14, 10:19 PM, Sachin Goswami wrote:

In OpenStack Swift, the xfs file system is integrated, which provides a maximum file system size of 8 exbibytes minus one byte (2^63-1 bytes).

Not exactly. The Swift storage nodes keep their data on POSIX filesystems with support for extended attributes. While XFS filesystems are typically used, XFS is not required.

We are studying use of LTFS integration with OpenStack Swift for scenarios like *Data Archival as a Service*. Was integration of LTFS with Swift considered before? If so, can you please share your study output? Will integration of LTFS with Swift fit into the existing Swift architecture?

Assuming it's POSIX enough and supports extended attributes, a tape filesystem on a spinning disk might technically work, but I don't see it performing well at all. If you're talking about using actual tapes for data storage, I can't imagine that working out for you. Most clients aren't prepared to wait multiple minutes for HTTP responses while a tape laboriously spins back and forth, so they'll just time out.
Re: [openstack-dev] Defining API Success in OpenStack APIs (specifically Swift)
On 6/20/13 4:21 AM, Sean Dague wrote:

The following patch review came into Tempest yesterday to stop checking for specific 20x codes on a number of Swift APIs - https://review.openstack.org/#/c/33689/

The official documentation for these APIs says the following - http://docs.openstack.org/api/openstack-object-storage/1.0/content/retrieve-account-metadata.html "The HTTP return code will be 2xx (between 200 and 299, inclusive) if the request succeeds"

This seems kind of broken to me that that's the contract provided. I've got a -1 on the patch right now, but I think this is worth raising for broader discussion. It seems to go somewhat contrary to https://wiki.openstack.org/wiki/APIChangeGuidelines and to the spirit of having stable, well-defined interfaces. So I guess I open up the question of: is it OK for OpenStack core projects to not commit to success codes for API calls? If so, we'll let the test change into Tempest. If not, we probably need to call that out in API standards.

I think that's really two separate questions. There's the question of what new APIs should be, but there's also the question of what existing APIs are. IMHO, it's entirely reasonable to have guidelines or rules for new APIs, but to go back and retroactively impose new standards on old APIs is too much, especially when it's done without even consulting that project's developers.

Remember, Swift predates not only the OpenStack API Change Guidelines mentioned above, but it also predates OpenStack, and it's only ever had one API version. If an old API isn't up to new standards, that's just something to grandfather in.
Re: [openstack-dev] [keystone] SPFE: Authenticated Encryption (AE) Tokens
On 2/14/15 9:49 PM, Adam Young wrote:
On 02/13/2015 04:19 PM, Morgan Fainberg wrote:
On February 13, 2015 at 11:51:10 AM, Lance Bragstad (lbrags...@gmail.com) wrote:

Hello all,

I'm proposing the Authenticated Encryption (AE) Token specification [1] as an SPFE. AE tokens increase the scalability of Keystone by removing token persistence. This provider has been discussed prior to, and at, the Paris summit [2]. There is an implementation that is currently up for review [3] that was built off a POC. Based on the POC, there has been some performance analysis done with respect to the token formats available in Keystone (UUID, PKI, PKIZ, AE) [4].

The Keystone team spent some time discussing limitations of the current POC implementation at the mid-cycle. One case that still needs to be addressed (and is currently being worked) is federated tokens. When requesting unscoped federated tokens, the token contains unbound groups which would need to be carried in the token. This case can be handled by AE tokens, but it would be possible for an unscoped federated AE token to exceed an acceptable AE token length (i.e. 255 characters). Long story short, a federation migration could be used to ensure federated AE tokens never exceed a certain length.

Feel free to leave your comments on the AE Token spec.

Thanks! Lance

[1] https://review.openstack.org/#/c/130050/
[2] https://etherpad.openstack.org/p/kilo-keystone-authorization
[3] https://review.openstack.org/#/c/145317/
[4] http://dolphm.com/benchmarking-openstack-keystone-token-formats/

I am for granting this exception as long as it’s clear that the following is clear/true:

* All current use-cases for tokens (including federation) will be supported by the new token provider.
* The federation tokens possibly being over 255 characters can be addressed in the future if they are not addressed here (a “federation migration” does not clearly state what is meant). I think the length of the token is not a real issue. We need to keep them within header lengths. That is 8k. Anything smaller than that will work. I'd like to respectfully disagree here. Large tokens can dramatically increase the overhead for users of Swift with small objects, since the token must be passed along with every request. For example, I have a small static web site: 68 files, mean file size 5508 bytes, median 636 bytes, total 374517 bytes. (It's an actual site; these are genuine data.) If I upload these files to Swift using a UUID token, then I incur maybe 400 bytes of overhead per file in the HTTP request, which is 7.3% bloat. On the other hand, if the token plus other headers is 8K, then I'm looking at 149% bloat, so I've more than doubled my transfer requirements just from tokens. :/ I think that, for users of Swift and any other OpenStack data-plane APIs, token size is a definite concern. I am very much in favor of anything that shrinks token sizes while keeping the scalability benefits of PKI tokens.
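The arithmetic behind those bloat percentages can be checked directly. A quick sketch; the 400-byte and 8 KiB per-request header sizes are the estimates from the message above, not measured values:

```python
# Overhead of per-request headers relative to payload, using the
# site's numbers from the message: 68 files totaling 374517 bytes.
files = 68
total_bytes = 374517

def bloat(header_bytes_per_request):
    """Extra bytes transferred, as a fraction of the payload."""
    return files * header_bytes_per_request / total_bytes

print(f"UUID token:  {bloat(400):.1%}")   # ~7.3%
print(f"8 KiB token: {bloat(8192):.1%}")  # ~149%
```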
Re: [openstack-dev] Reasoning behind my vote on the Go topic
On 6/7/16 12:00 PM, Monty Taylor wrote: [snip] > I'd rather see us focus energy on Python3, asyncio, and its pluggable event loops. The work in http://magic.io/blog/uvloop-blazing-fast-python-networking/ is a great indication, in an actual apples-to-apples comparison, of what can be accomplished in Python doing IO-bound activities by using modern Python techniques. I think that comparing python2+eventlet to a fresh rewrite in Go isn't 100% of the story. A TON of work has gone into Python that we're not taking advantage of because we're still supporting Python2. So what I'd love to see in the realm of comparative experimentation is whether the existing Python we already have can be leveraged as we adopt newer and more modern things. Asyncio, eventlet, and other similar libraries are all very good for performing asynchronous IO on sockets and pipes. However, none of them help with filesystem IO. That's why Swift needs a golang object server: the Go runtime will keep some goroutines running even while other goroutines are performing filesystem IO, whereas filesystem IO in Python blocks the entire process, asyncio or no asyncio.
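To illustrate the point about filesystem IO: asyncio's event loop can multiplex sockets, but there is no awaitable read() for regular files, so the standard workaround is to push the blocking call onto a thread pool with run_in_executor. A minimal sketch (standard asyncio usage, not Swift code):

```python
import asyncio
import os
import tempfile

def blocking_read(path):
    # Plain file IO: the calling thread blocks until the kernel has
    # copied the data into process memory.
    with open(path, "rb") as f:
        return f.read()

async def serve(path):
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor; other
    # coroutines keep running while a pool thread waits on the disk.
    return await loop.run_in_executor(None, blocking_read, path)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello, disk")

data = asyncio.run(serve(tmp.name))
os.unlink(tmp.name)
print(data)  # b'hello, disk'
```

Note that this only moves the blocking into threads; it does not make the read itself asynchronous, which is the gap the message above is describing.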
Re: [openstack-dev] [swift] Can swift identify user agent come from chrome browser?
On 3/17/16 1:53 AM, Linpeimin wrote: Hello, everyone. I have configured a web server (tengine) as a proxy server for Swift, and sent a GET request via a Chrome browser in order to access a Swift container. From the log files, it can be seen that the web server has passed the request to Swift, but Swift returns an unauthorized error. The log files record this: Access logs of *tengine:* 10.74.167.183 - - [17/Mar/2016:16:30:03 +] "GET /auth/v1.0 HTTP/1.1" 401 131 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36" "-" 10.74.167.183 - - [17/Mar/2016:16:30:03 +] "GET /favicon.ico HTTP/1.1" 401 649 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36" "-" Proxy logs of *swift*: Mar 17 15:12:27 localhost journal: proxy-logging 10.74.167.183 192.168.1.5 17/Mar/2016/15/12/27 GET /auth/v1.0 HTTP/1.0 401 - Mozilla/5.0%20%28Windows%20NT%206.1%3B%20WOW64%29%20AppleWebKit/537.36%20%28KHTML%2C%20like%20Gecko%29%20Chrome/28.0.1500.72%20Safari/537.36 - - 131 - tx21863381504d47098a73846d621fcbd0 - 0.0003 - Mar 17 15:12:27 localhost journal: tempauth 10.74.167.183 192.168.1.5 17/Mar/2016/15/12/27 GET /auth/v1.0 HTTP/1.0 401 - Mozilla/5.0%20%28Windows%20NT%206.1%3B%20WOW64%29%20AppleWebKit/537.36%20%28KHTML%2C%20like%20Gecko%29%20Chrome/28.0.1500.72%20Safari/537.36 - - - - tx21863381504d47098a73846d621fcbd0 - 0.0013 It's the same value, just URL-encoded. Swift's access log is formatted as one record per line, with fields delimited by spaces. Since the user-agent string may contain spaces, it's escaped before logging so that the log formatting isn't broken.
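The equivalence is easy to verify: the user-agent in Swift's proxy log is the same string tengine logged, just percent-encoded so its spaces cannot break the space-delimited log format. Here `urllib.parse.quote` stands in for Swift's own logging escape; it happens to produce the same encoding for this string:

```python
from urllib.parse import quote, unquote

# User-agent as it appears in the tengine access log.
ua = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36")

# The same value as it appears in Swift's proxy log.
logged = ("Mozilla/5.0%20%28Windows%20NT%206.1%3B%20WOW64%29%20"
          "AppleWebKit/537.36%20%28KHTML%2C%20like%20Gecko%29%20"
          "Chrome/28.0.1500.72%20Safari/537.36")

print(quote(ua) == logged)    # True
print(unquote(logged) == ua)  # True
```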
Re: [openstack-dev] [tc] supporting Go
On 5/9/16 5:21 PM, Robert Collins wrote: On 10 May 2016 at 10:54, John Dickinson wrote: On 9 May 2016, at 13:16, Gregory Haynes wrote: This is a bit of an aside, but I am sure others are wondering the same thing - is there some info (specs/etherpad/ML thread/etc) that has more details on the bottleneck you're running into? Given that the only clients of your service are the public-facing DNS servers, I am now even more surprised that you're hitting a Python-inherent bottleneck. In Swift's case, the summary is that it's hard[0] to write a network service in Python that shuffles data between the network and a block device (hard drive) and effectively utilizes all of the hardware available. So far, we've done very well by fork()'ing child processes, ... Initial results from a golang reimplementation of the Python object server are very positive[1]. We're not proposing to rewrite Swift entirely in Golang. Specifically, we're looking at improving object replication time in Swift. This service must discover what data is on a drive, talk to other servers in the cluster about what they have, and coordinate any data sync process that's needed. [0] Hard, not impossible. Of course, given enough time, we can do anything in a Turing-complete language, right? But we're not talking about possible, we're talking about efficient tools for the job at hand. ... I'm glad you're finding you can get good results in (presumably) clean, understandable code. Given Go's historically poor performance with multiple cores (https://golang.org/doc/faq#Why_GOMAXPROCS), I'm going to presume the major advantage is in the CSP programming model - something that Twisted does very well: and frustratingly we've had numerous discussions with folk in the Twisted world who see the pain we have and want to help, but as a community we've consistently stayed with eventlet, which has a threaded programming model - and threaded models are poorly suited for the case here.
At its core, the problem is that filesystem IO can take a surprisingly long time, during which the calling thread/process is blocked, and there's no good asynchronous alternative. Some background: With Eventlet, when your greenthread tries to read from a socket and the socket is not readable, then recvfrom() returns -1/EWOULDBLOCK; then, the Eventlet hub steps in, unschedules your greenthread, finds an unblocked one, and lets it proceed. It's pretty good at servicing a bunch of concurrent connections and keeping the CPU busy. On the other hand, when the socket is readable, then recvfrom() returns quickly (a few microseconds). The calling process was technically blocked, but the syscall is so fast that it hardly matters. Now, when your greenthread tries to read from a file, that read() call doesn't return until the data is in your process's memory. This can take a surprisingly long time. If the data isn't in buffer cache and the kernel has to go fetch it from a spinning disk, then you're looking at a seek time of ~7 ms, and that's assuming there are no other pending requests for the disk. There's no EWOULDBLOCK when reading from a plain file, either. If the file pointer isn't at EOF, then the calling process blocks until the kernel fetches data for it. Back to Swift: The Swift object server basically does two things: it either reads from a disk and writes to a socket or vice versa. There's a little HTTP parsing in there, but the vast majority of the work is shuffling bytes between network and disk. One Swift object server can service many clients simultaneously. The problem is those pauses due to read(). If your process is servicing hundreds of clients reading from and writing to dozens of disks (in, say, a 48-disk 4U server), then all those little 7 ms waits are pretty bad for throughput. Now, a lot of the time, the kernel does some readahead so your read() calls can quickly return data from buffer cache, but there are still lots of little hitches. 
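The asymmetry described above can be demonstrated in a few lines: a non-blocking socket read fails fast with EWOULDBLOCK/EAGAIN when no data is ready (which is what lets a hub like eventlet's schedule another greenthread), while a read() on a regular file never does; it simply blocks until the data is in memory, even if O_NONBLOCK is set. A small sketch:

```python
import os
import socket
import tempfile

# 1. Non-blocking socket with no data pending: recv() raises
#    BlockingIOError (EWOULDBLOCK/EAGAIN) immediately.
a, b = socket.socketpair()
a.setblocking(False)
try:
    a.recv(1024)
    socket_result = "data"
except BlockingIOError:
    socket_result = "EWOULDBLOCK"
a.close(); b.close()

# 2. Regular file: O_NONBLOCK is accepted but has no effect; read()
#    just blocks until the kernel delivers the bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"on disk")
fd = os.open(tmp.name, os.O_RDONLY | os.O_NONBLOCK)
file_result = os.read(fd, 1024)  # returns the data, never EWOULDBLOCK
os.close(fd)
os.unlink(tmp.name)

print(socket_result, file_result)
```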
But wait: it gets worse. Sometimes a disk gets slow. Maybe it's got a lot of pending IO requests, maybe its filesystem is getting close to full, or maybe the disk hardware is just starting to get flaky. For whatever reason, IO to this disk starts taking a lot longer than 7 ms on average; think dozens or hundreds of milliseconds. Now, every time your process tries to read from this disk, all other work stops for quite a long time. The net effect is that the object server's throughput plummets while it spends most of its time blocked on IO from that one slow disk. Now, of course there are things we can do. The obvious one is to use a couple of IO threads per disk and push the blocking syscalls out there... and, in fact, Swift did that. In commit b491549, the object server gained a small threadpool for each disk[1] and started doing its IO there. This worked pretty well for avoiding the slow-disk problem. Requests that touched the slow disk would back up,
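The per-disk threadpool idea can be sketched roughly like this. This is an illustration of the technique, not Swift's actual code from commit b491549; the class name, pool size, and read_chunk helper are all invented for the example:

```python
from concurrent.futures import ThreadPoolExecutor
import os
import tempfile

class DiskPools:
    """One small threadpool per disk, so a slow disk only backs up
    requests that actually touch it instead of stalling the whole
    server process."""

    def __init__(self, disks, threads_per_disk=4):
        self.pools = {d: ThreadPoolExecutor(max_workers=threads_per_disk)
                      for d in disks}

    def read_chunk(self, disk, path, offset, length):
        def _read():
            # The blocking read() happens on a pool thread; the main
            # (green)thread stays free to service other clients.
            with open(path, "rb") as f:
                f.seek(offset)
                return f.read(length)
        return self.pools[disk].submit(_read)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")

pools = DiskPools(["sda", "sdb"])
chunk = pools.read_chunk("sda", tmp.name, 2, 4).result()
os.unlink(tmp.name)
print(chunk)  # b'2345'
```

If "sda" gets slow, futures queue up in its pool while reads destined for "sdb" keep completing, which is exactly the isolation property described above.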