Re: cmake

2015-12-04 Thread Pete Zaitcev
On Thu, 3 Dec 2015 19:26:52 -0500 (EST)
Matt Benjamin wrote:

> Could you share the branch you are trying to build?  (ceph/wip-5073 would not 
> appear to be it.)

It's the trunk with a few of my insignificant cleanups.

But I found a fix: deleting the CMakeFiles/ and CMakeCache.txt let
it run. Thanks again for the tip about the separate build directory.
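
For the archive, a minimal sketch of that cleanup, in Python for brevity. The
helper name clear_cmake_cache is made up for illustration, and it should be
pointed at whichever directory holds the stale CMakeCache.txt and CMakeFiles/
(the message above does not say which directory that was).

```python
import shutil
from pathlib import Path
from typing import List


def clear_cmake_cache(tree: Path) -> List[str]:
    """Remove CMakeCache.txt and CMakeFiles/ from `tree`, if present.

    Returns the names of the entries that were actually removed.
    """
    removed = []
    cache_file = tree / "CMakeCache.txt"
    if cache_file.is_file():
        cache_file.unlink()
        removed.append(cache_file.name)
    cache_dir = tree / "CMakeFiles"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        removed.append(cache_dir.name)
    return removed
```

Running cmake again from a separate build directory then starts from a clean
cache.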

-- Pete
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cmake

2015-12-03 Thread Pete Zaitcev
On Thu, 3 Dec 2015 17:30:21 -0500
"Adam C. Emerson" <aemer...@redhat.com> wrote:

> On 03/12/2015, Pete Zaitcev wrote:

> > I'm trying to run cmake, in order to make sure my patches do not break it
> > (in particular WIP 5073 added source files). Result looks like this:
> > 
> > [zaitcev@lembas ceph-tip]$ cmake src
> 
> I believe the problem is 'cmake src'

Thanks for the tip about the separate build directory and the top-level
CMakeLists.txt. However, it still fails like this:

[zaitcev@lembas build]$ cmake ..
CMake Error at CMakeLists.txt:1 (include):
  include could not find load file:

    GetGitRevisionDescription
...

Do you know by any chance where it gets that include? Also, what's
your cmake --version?

Greetings,
-- Pete


Re: Bucket namespaces pull req. 5872

2015-11-29 Thread Pete Zaitcev
On Mon, 26 Oct 2015 16:09:41 +0100
Radoslaw Zarzynski wrote:

> > The rgw_swift_account_in_url should be possible to incorporate
> > in a compatible fashion (it does not add an extra next_tok()).
> 
> According to "rgw_swift_account_in_url": I don’t see viable method for
> deducing whether two tokens in URL refer to 1) account and bucket or
> 2) bucket and object. Of course, we may apply some kind of heuristic
> like scanning the first token for auth prefix (eg. “AUTH_”, “KEY_”) but
> this would introduce limitations on bucket naming.

I thought a bit more about this and I want to backtrack on my agreement.
Your reasoning would be sound if we did not know the format of the
incoming URL ahead of time. Indeed there's no telling if it's
/account/bucket or /bucket/object. But actually we do have that knowledge,
because it's the URL that we gave to the client when it authenticated!

There may be an exception, namely if a client gets a token across a cluster
upgrade. For a while we should recognize old tokens and thus old StorageURL
formats. This makes me think that the kind of path parsing could be a
parameter hidden in the token somewhere.

Anyhow, rather than speculate about it, I'm going to put together a patch
to go on top of current WIP 5073 and then you'll see what I mean.
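
Until the patch is ready, here is a rough sketch of the idea in Python. It
assumes the server already knows, from the StorageURL it issued, whether the
account is part of the path; parse_swift_path and the account_in_url flag are
illustrative names, not the actual radosgw code.

```python
from typing import NamedTuple, Optional


class SwiftPath(NamedTuple):
    account: Optional[str]
    bucket: Optional[str]
    obj: Optional[str]


def parse_swift_path(path: str, account_in_url: bool) -> SwiftPath:
    """Split the part of a Swift URL that follows the API prefix.

    account_in_url states which layout was handed to the client in its
    StorageURL: /account/bucket/object (True) or /bucket/object (False).
    """
    rest = path.lstrip("/")
    account = None
    if account_in_url:
        # The first token is the account; no guessing from prefixes needed.
        account, _, rest = rest.partition("/")
    bucket, _, obj = rest.partition("/")
    return SwiftPath(account or None, bucket or None, obj or None)
```

The same two tokens parse differently depending only on the flag, which is
the point: no heuristic on the first token is involved.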

-- Pete


Re: RGW multi-tenancy APIs overview

2015-11-18 Thread Pete Zaitcev
On Mon, 9 Nov 2015 21:36:47 -0800
Yehuda Sadeh-Weinraub wrote:

> In the supported domains configuration, we can specify for each domain
> whether a subdomain for it would be a bucket (as it is now), or
> whether it would be a tenant (which implies the possibility of
> bucket.tenant). This only affects the global (a.k.a the "empty")
> tenant.
> 
> E.g., we can have two domains:
> 
> legacy-foo.com
> new-foo.com
> 
> We'd specify that legacy-foo.com is a global tenant endpoint. In which
> case, when accessing buck.legacy-foo.com, it will access the global
> tenant, and bucket=buck.
> Whereas, new-foo.com isn't a global tenant endpoint, in which case, if
> we'd access buck.new-foo.com, it will mean that we accessed the 'buck'
> tenant.

I think I found another issue with this. Suppose we want a client that is
authenticated under an explicit tenant to access a legacy bucket. The only
way for that to work is for the client to use a different endpoint (in your
example above, legacy-foo.com). The client cannot use the buck.new-foo.com
syntax, as mentioned. So, there's a certain asymmetry built into the system.

Oddly enough, the x-amz-copy-source: syntax always includes the bucket, and
the tenant:bucket syntax is recognized there, so miraculously we're good there.

-- Pete



Re: RGW multi-tenancy APIs overview

2015-11-17 Thread Pete Zaitcev
On Mon, 9 Nov 2015 21:36:47 -0800
Yehuda Sadeh-Weinraub wrote:

We discussed this a bit at the RGW team meeting in BJ, and there were some
developments, so here is an update.

> > #1 Back-end and radosgw-admin use '/' or "tenant/bucket". This is what is
> > literally stored in RADOS, because it's used to name bucket objects in
> > the .rgw pool.

This works.

> > #2 Buckets in Swift URLs use '\' (backslash), because there does not seem
> > to be a way to use '/'. Example:
> >  http://host.corp.com:8080/swift/v1/testen\testcont

> > Note that strictly speaking, we don't really need this, since Swift URLs
> > could easily include tenant names where reference Swift places account
> > names. It's just easier to implement without disturbing authentication
> > code.
> 
> I think that leveraging the native swift URL tenant encoding is
> probably a cleaner solution than having it encoded as a backslash.

Indeed, I clearly took a lazy way out. Backslashes are removed from the
current pull request #6358, to be replaced with the right solution through
the auth.

> > #3 S3 host addressing of buckets
> >
> > This is similar to Swift and is slated to use backslash. Note that S3
> > prohibits it, so we're reasonably safe with this choice.

We circled back to this in the meeting and replaced backslashes with colons,
for a cleaner look.

Note that in a potentially major development, the tenant name and the colon
are included in the string that is used to calculate the HMAC. So,
theoretically, stock clients could work by packing the tenant into the bucket
name, unless they use some kind of configuration syntax with colons. One
motivation for the backslash was that it might be easier to get clients to
accept it. This is something we ought to test and possibly tweak.
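
To illustrate why the signed string matters, here is a small Python sketch.
It assumes AWS signature version 2 and a request without x-amz-* headers (this
message does not say which signature version is in play), and the resource
path "/footenant:foobucket/obj" is only an example of a packed name.

```python
import base64
import hashlib
import hmac


def sign_v2(secret_key: str, verb: str, date: str, resource: str,
            content_md5: str = "", content_type: str = "") -> str:
    """AWS signature version 2 for a request without x-amz-* headers."""
    string_to_sign = "\n".join([verb, content_md5, content_type, date])
    string_to_sign += "\n" + resource
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


date = "Tue, 17 Nov 2015 16:00:12 +0000"
plain = sign_v2("secret", "GET", date, "/foobucket/obj")
packed = sign_v2("secret", "GET", date, "/footenant:foobucket/obj")
```

The two signatures differ, so a stock client only works if it signs exactly
the tenant:bucket string that the server reconstructs.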

> > #4 S3 URL addressing of buckets
> >
> > Here we must use a period. Example:
> >  bucket.tenant.host.corp.com
> 
> Can probably identify this automatically, if the host is at a
> subdomain of a supported domain, and it's a second level subdomain
> from the main domain then we can regard it as .

This part suffered a major setback. And oddly enough nobody raised any
objections until now. But periods are permitted in bucket names in S3.
Therefore the "bucket.tenant.host.domain" thing cannot work.
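
A quick Python illustration of the ambiguity. The function name host_readings
is made up, and the sketch assumes that a tenant name is a single DNS label
while a bucket name may contain periods.

```python
from typing import List, Tuple


def host_readings(host: str, base_domain: str) -> List[Tuple[str, str]]:
    """Return every (tenant, bucket) reading of the labels before base_domain."""
    suffix = "." + base_domain
    if not host.endswith(suffix):
        return []
    prefix = host[: -len(suffix)]
    # Reading 1: no tenant, the whole prefix is a bucket name (dots allowed).
    readings = [("", prefix)]
    # Reading 2: the last label is the tenant, the rest is the bucket.
    bucket, dot, tenant = prefix.rpartition(".")
    if dot:
        readings.append((tenant, bucket))
    return readings
```

For my.bucket.host.corp.com this returns both ("", "my.bucket") and
("bucket", "my"), and nothing in the host name says which one was meant.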

I'm inclined to document this and simply tell people who want to access
buckets across tenants to use URL paths per above. However...

> In the supported domains configuration, we can specify for each domain
> whether a subdomain for it would be a bucket (as it is now), or
> whether it would be a tenant (which implies the possibility of
> bucket.tenant). This only affects the global (a.k.a the "empty")
> tenant.
> 
> E.g., we can have two domains:
> 
> legacy-foo.com
> new-foo.com
> 
> We'd specify that legacy-foo.com is a global tenant endpoint. In which
> case, when accessing buck.legacy-foo.com, it will access the global
> tenant, and bucket=buck.
> Whereas, new-foo.com isn't a global tenant endpoint, in which case, if
> we'd access buck.new-foo.com, it will mean that we accessed the 'buck'
> tenant.

Okay, that is plausible (meaning: I can make it work). This scheme has
a small downside: these DNS domains cannot nest. In other words,
corp.com cannot be migrated to tenantized-rados.corp.com. But otherwise,
I don't see why it can't work.
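
A minimal sketch of how such a per-domain setting could be consulted, in
Python. The DOMAINS table and resolve_host are hypothetical names, and the
sketch assumes that on a tenant endpoint the bucket is taken from the URL path
rather than from the host name.

```python
from typing import Dict, Optional, Tuple

# Hypothetical configuration: domain -> True if a subdomain names a bucket
# in the global (empty) tenant, False if a subdomain names a tenant.
DOMAINS: Dict[str, bool] = {
    "legacy-foo.com": True,
    "new-foo.com": False,
}


def resolve_host(host: str,
                 domains: Dict[str, bool] = DOMAINS) -> Optional[Tuple[str, str]]:
    """Return the (tenant, bucket) implied by the host name, or None."""
    for domain, is_global in domains.items():
        suffix = "." + domain
        if not host.endswith(suffix):
            continue
        sub = host[: -len(suffix)]
        if not sub:
            return None
        if is_global:
            return ("", sub)   # buck.legacy-foo.com: global tenant, bucket=buck
        return (sub, "")       # buck.new-foo.com: tenant 'buck', bucket from path
    return None
```

With this table, buck.legacy-foo.com resolves to bucket "buck" in the global
tenant, while buck.new-foo.com resolves to the tenant "buck".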

> > #5 Listings and redirects.
> >
> > Listings present a difficulty in S3: we don't know if the name will be
> > used in host-based or URL-based addressing of a bucket. So, we put the
> > tenant of a bucket into a separate XML attribute.
> 
> You mean a separate http header? http param?

Not sure what that question assumed. A separate XML attribute looks like:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01">
 <Name>foobucket</Name>
 <Tenant>footenant</Tenant>  <=== this one
 <IsTruncated>false</IsTruncated>

There's a certain problem with this, in case the client is constructing
the URLs for further access. In case it's trying to access across tenants,
it has to fetch the tenant name from the attribute. I thought about returning
the bucket name as footenant:foobucket for all buckets that
belong to non-empty tenant, but that seems asking for compatibility issues
even for access within the tenant.

> > Since Swift listings are always in a specific account, and thus tenant,
> > they are unchanged.
> >
> > In addition to listings, bucket names leak into certain HTTP headers, where
> > we add "Tenant:" headers as appropriate.
> >
> > Finally, multi-tenancy also puts user_uid namespaces under tenants as well
> > as bucket namespaces. That one is easy though. A '$' separator is used
> > consistently for it (tenant$user).

> Does that work the same for object copy, and acls?

ACLs do not list buckets, only users, which may be qualified (tenant$user).
COPY verbs use colons in S3. Fortunately there's no additional complication
with HMAC. In Swift, currently, all that support is removed together with
the backslash and is waiting for "new" URLs.

-- Pete

Re: RGW multi-tenancy APIs overview

2015-11-17 Thread Pete Zaitcev
On Tue, 17 Nov 2015 16:00:12 -0800
Yehuda Sadeh-Weinraub wrote:

> > <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01">
> >  <Name>foobucket</Name>
> >  <Tenant>footenant</Tenant>  <=== this one
> >  <IsTruncated>false</IsTruncated>
> >
> > There's a certain problem with this, in case the client is constructing
> > the URLs for further access. In case it's trying to access across tenants,
> > it has to fetch the tenant name from the attribute. I thought about
> > returning the bucket name as footenant:foobucket for all buckets that
> > belong to non-empty tenant, but that seems asking for compatibility issues
> > even for access within the tenant.
> 
> Ah, I understand this point now. Note that a user will only have
> buckets under its own tenant (not going to own buckets from another
> tenant), so I'm not sure we need to add this.

Good point. I think I'll drop that.

> >> Does that work the same for object copy, and acls?
> >
> > ACLs do not list buckets, only users, which may be qualified (tenant$user).
> 
> Not tenant:user?

I forgot what happened when we did tenant:user. There was some kind
of metadata syntax that used a colon somewhere in one of the APIs. I shall
re-examine that and add a code comment near to_str() with the dollar.

-- Pete


RGW multi-tenancy APIs overview

2015-11-09 Thread Pete Zaitcev
With ticket 5073 getting close to complete, we're getting the APIs mostly
nailed down. Most of them come down to selecting a syntax separator
character. Unfortunately, there are several such characters. Plus,
it is not always feasible to get by with a character (in S3 at least).

So far we have the following changes:

#1 Back-end and radosgw-admin use '/' or "tenant/bucket". This is what is
literally stored in RADOS, because it's used to name bucket objects in
the .rgw pool.

#2 Buckets in Swift URLs use '\' (backslash), because there does not seem
to be a way to use '/'. Example:
 http://host.corp.com:8080/swift/v1/testen\testcont

At first, I tried URL encoding (%2f), but that didn't work: we permit '%'
in Swift container names, so there's a show-stopper compatibility problem.
So, backslash. The backslash poses a similar problem, too, but hopefully
nobody created a container with a backslash in its name.

Note that strictly speaking, we don't really need this, since Swift URLs
could easily include tenant names where reference Swift places account names.
It's just easier to implement without disturbing authentication code.

#3 S3 host addressing of buckets

This is similar to Swift and is slated to use backslash. Note that S3
prohibits it, so we're reasonably safe with this choice.

#4 S3 URL addressing of buckets

Here we must use a period. Example:
 bucket.tenant.host.corp.com

#5 Listings and redirects.

Listings present a difficulty in S3: we don't know if the name will be
used in host-based or URL-based addressing of a bucket. So, we put the
tenant of a bucket into a separate XML attribute.

Since Swift listings are always in a specific account, and thus tenant,
they are unchanged.

In addition to listings, bucket names leak into certain HTTP headers, where
we add "Tenant:" headers as appropriate.

Finally, multi-tenancy also puts user_uid namespaces under tenants as well
as bucket namespaces. That one is easy though. A '$' separator is used
consistently for it (tenant$user).
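
To summarize the two separators that are settled, a small Python sketch. The
function names are illustrative only, and the sketch assumes that a name
without a separator belongs to the empty (global) tenant.

```python
from typing import Tuple


def split_qualified(name: str, sep: str) -> Tuple[str, str]:
    """Split 'tenant<sep>name' into (tenant, name).

    A name without the separator is treated as belonging to the empty tenant.
    """
    tenant, found, rest = name.partition(sep)
    return (tenant, rest) if found else ("", name)


def parse_backend_bucket(name: str) -> Tuple[str, str]:
    return split_qualified(name, "/")   # back-end form: "tenant/bucket"


def parse_user(name: str) -> Tuple[str, str]:
    return split_qualified(name, "$")   # user form: "tenant$user"
```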

-- Pete


Re: Symbolic links like feature on radosgw

2015-11-02 Thread Pete Zaitcev
On Mon, 2 Nov 2015 21:25:31 -0800
Guang Yang wrote:

> Is this a valid feature request we can put into radosgw? The way I am
> thinking to implement is like symbolic link, the link object just
> contains a pointer to the original object.

It's not going to be sufficient. What do you think the Content-Length
and Content-Type should be for the HEAD and GET on the symlink object?
What I'm getting at, though, is that for a symlink to be of any use, it must
return the linked object on GETs. Yet, it must be distinguishable
(return some kind of system metadata). Also, it must be possible to
discover that it is, in fact, a symlink. A query parameter perhaps.

-- Pete


Re: Bucket namespaces pull req. 5872

2015-10-29 Thread Pete Zaitcev
On Mon, 26 Oct 2015 16:09:41 +0100
Radoslaw Zarzynski wrote:

> Those are the reasons I’m behind keeping rgw_user even if the
> entire information it carries would be solely an ID string.

Okay. It felt a little obfuscatory but perhaps it's my kernel background
talking.

> Yeah, bucket namespace is a part of account/RGWUserInfo*. Property
> “has_own_bns” even now is serialized together with RGWUserInfo.

Very well, we're good.

> > The rgw_swift_create_account_with_bns should go away with rgw_user.
> 
> Option "rgw_swift_create_account_with_bns" is needed mostly due to
> integration with OpenStack (Keystone) when accounts* are automatically
> created at first access. Without the parameter you would lose ability to
> tell radosgw what is more important for you: compliance with Swift API
> or previous behavior that still may be useful in some cases. Creating
> massive amount of accounts by hand might not be an option here.

I'm buying the logic here: at the time of auto-creation, we do not
possess the information about the account being auto-created wanting BNS
or not. Still, it feels unsatisfactory. I'd rather look into some sort
of user attributes in Keystone or whatnot. I'll investigate and report.

> > The rgw_swift_account_in_url should be possible to incorporate
> > in a compatible fashion (it does not add an extra next_tok()).
> 
> According to "rgw_swift_account_in_url": I don’t see viable method for
> deducing whether two tokens in URL refer to 1) account and bucket or
> 2) bucket and object. Of course, we may apply some kind of heuristic
> like scanning the first token for auth prefix (eg. “AUTH_”, “KEY_”) but
> this would introduce limitations on bucket naming.

That makes sense, but it's not how I read the actual code. I'll look again.

-- Pete


Bucket namespaces pull req. 5872

2015-10-14 Thread Pete Zaitcev
I took a decent look at the pull request 5872
  https://github.com/ceph/ceph/pull/5872
It implements something called "bucket namespaces": a way to make
buckets qualified with a prefix that permits different users to use
buckets with the same name.

I think I like the idea overall, but the implementation raises
some questions. The most important in my mind is: why use rgw_user?

In the wip-5073, rgw_user is needed because tenant there adds
a namespace both to users and buckets. But here, users are not
in a namespace, only buckets are. Or at least that's what I see
in the code, please set me straight if I'm wrong.

Conceptually, the user name is just a label, and this patch keeps
those labels compatible. I think the information about a user
should contain the user's bucket namespace, but the user's label
does not need to have it. So, RGWUserInfo should have the bucket
namespace name (and possibly has_own_bns), and rgw_user is superfluous.

If we could get rid of rgw_user, I would be on board with this.

Less importantly, I do not like the generosity with knobs.
The rgw_swift_create_account_with_bns should go away with rgw_user.
The rgw_swift_account_in_url should be possible to incorporate
in a compatible fashion (it does not add an extra next_tok()).
The rgw_keystone_accepted_admin_roles... okay, that one might
be needed. Swift has an equivalent of it.

Finally, there are some minuscule technical issues.
 - Is it just me, or do the encoding and decoding of RGWUserInfo
   not match?  Decoding appears to make provision for wip-5073,
   which we may not even need.
 - The --own-bucket-namespace should not be a boolean, but the
   namespace's name.
 - There's some junk imported from wip-5073; I'll work on cleaning
   that up.

-- Pete


Re: [rgw] Multi-tenancy support in radosgw

2015-09-14 Thread Pete Zaitcev
On Sat, 12 Sep 2015 00:24:27 +0200
Radoslaw Zarzynski wrote:

> Each already existing user would obtain empty bucket namespace
> by default. It will be possible to create user  with his own, unique
> namespace. [...]

> 2. We will always need ID of namespace in order to access proper
> bucket entry points. [...]

> I would like to ask for reviews of the idea and feedback.

I still don't understand how this is different from tenants in wip-5073.
Each tenant defines what amounts to a namespace for buckets. Could you
clear this up? I heard you discussing it a little bit during the RGW
team stand-up call, but I can't wrap my head around it.

Secondly, have you given any thought to the exact API here, or is it all
hand-waved for now? As you may know, initially I hoped we'd get by
without adding any special syntax for tenants. Just off-load it onto
the authentication system, I thought. That didn't work and now we
have the tenant$user syntax allowed all over.

What I mean by that is that a client has to do
"swift -U tenant\$user:subuser -K pass" in 5073, and that looks a bit
fraught. I presume you avoided that.

Specifically though, imagine I'm doing "radosgw-admin user info",
what do I get under your plan?

-- Pete

P.S. What repo/branch do you use for this? I know you quoted it before,
but just so we're at the same place.


Re: Multi-tenancy in radosgw

2015-09-02 Thread Pete Zaitcev
On Wed, 2 Sep 2015 14:23:33 +0200
Radoslaw Zarzynski wrote:

> What is the current status of multi-tenancy support in radosgw?
> I saw branch wip-ymt-5073 which looks really nice.

That branch is actually a clone of wip-5073-3, only I made it build on the
current code (about 9.0.2 level) and pass "make check". I also streamlined
a couple of things, like the confusing maze of assignments between tenant
and user_id.tenant in radosgw-admin (rgw_admin.cc). Oh, and we settled on
using '/' as the syntax separator for buckets in tenant namespace. The
original wip-5073-3 had not yet come to that decision.

> What is missing? What do we need to implement?
> I guess I would have time to help with this.

Well... In theory everything is in place. In practice, nothing works. :-)
If you want to create buckets within a non-default tenant, you have to
have a user sited under a non-default tenant. But if you try to create
one, this happens:

$ LD_LIBRARY_PATH=/q/zaitcev/ceph/ceph-tip/src/.libs \
    /q/zaitcev/ceph/ceph-tip/src/.libs/radosgw-admin user create --tenant=prodtx \
    --uid=prodt --subuser=prodt:prod1 --display-name="Prod Tenant T" \
    --key-type=swift --access=full --email=pr...@zaitcev.lan
{
    "user_id": "prodtx:prodt",
    "display_name": "Prod Tenant T",
    "email": "pr...@zaitcev.lan",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [
        {
            "id": "prodtx:prodt:prod1",
            "permissions": "full-control"
        }
    ],
    "swift_keys": [
        {
            "user": "prodtx:prodt:prod1",
            "secret_key": "rfJdJIDxCypUVgld2OUPpjgOUQ6R2UycHpqxuQyP"
        }
    ],
    ...
}

As you can see, tenant leaks _everywhere_. It's a real pest.
In addition, Yehuda chose to use a struct rgw_user as a vehicle to carry
the tenant. That has important advantages, the biggest being that you
let gcc point out all the places where conversion may be needed.
However, per above, the serialization into JSON relies on to_str(),
and then you cannot tell if there's actually a tenant field set
correctly in RGWUserInfo or it's just a colon leaked into user.
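
The ambiguity in a few lines of Python. This to_str() is a hypothetical
stand-in that joins tenant and id with a colon; it is not the real rgw_user
method, only a model of the output seen above.

```python
def to_str(tenant: str, uid: str) -> str:
    """Hypothetical stand-in: join tenant and id with ':' when a tenant is set."""
    return f"{tenant}:{uid}" if tenant else uid


with_tenant = to_str("prodtx", "prodt")     # tenant field set properly
leaked = to_str("", "prodtx:prodt")         # empty tenant, colon inside the id
```

Both calls produce "prodtx:prodt", so the JSON dump alone cannot tell the two
states apart.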

As for where your effort would be best applied, I honestly have no clue.
This is not easy to decompose further into tasks. Maybe you could
just take over :-)

-- Pete


RGWUserInfo encode/decode

2015-08-13 Thread Pete Zaitcev
Hi,

I stumbled upon code like this in src/rgw/rgw_common.h:

struct RGWUserInfo {
  map<string, RGWAccessKey> swift_keys;  // not one swift_key, a whole map
  void decode(bufferlist::iterator& bl) {
    string swift_key;  // a local variable, not a member
    if (struct_v >= 4) ::decode(swift_key, bl);
    // that's all, folks
  }
};

Looks like swift_key is never set when RGWUserInfo is deserialized.
It looks like a bug on the surface, but nobody cared thus far, so maybe
it's not actually a problem.

Do you happen to recall where this is used... Does radosgw-admin send
a whole RGWUserInfo to RADOS, ever? It appears to go the other way, in
show_user_info(), but strangely enough the tool shows Swift keys just fine.
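
One situation in which a decode-and-discard like this is harmless, offered
purely as a generic illustration: if the scalar were a legacy field that still
occupies space in the encoded stream, the decoder would have to read past it
to stay aligned. Whether that is the case for RGWUserInfo is not established
here. The Python sketch below uses made-up field names and a simple
length-prefixed string encoding.

```python
import struct
from typing import Dict, Tuple


def encode_string(s: str) -> bytes:
    """Encode a string as a 4-byte little-endian length followed by the bytes."""
    raw = s.encode("utf-8")
    return struct.pack("<I", len(raw)) + raw


def decode_string(buf: bytes, pos: int) -> Tuple[str, int]:
    """Decode one length-prefixed string; return it and the next position."""
    (n,) = struct.unpack_from("<I", buf, pos)
    start = pos + 4
    return buf[start:start + n].decode("utf-8"), start + n


def decode_record(buf: bytes, struct_v: int) -> Dict[str, str]:
    """Decode a toy record whose v4+ encoding carries a legacy key field."""
    pos = 0
    display_name, pos = decode_string(buf, pos)
    if struct_v >= 4:
        # Read and drop the legacy field so that later fields line up.
        _legacy_key, pos = decode_string(buf, pos)
    email, pos = decode_string(buf, pos)
    return {"display_name": display_name, "email": email}
```

In this toy format, skipping the read would make the decoder return the legacy
key in place of the email.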

-- Pete


Civetweb and mg_set_http_status

2015-08-04 Thread Pete Zaitcev
I did a git pull today and now the build breaks like this:

  CXX      rgw/libcivetweb_la-rgw_civetweb.lo
rgw/rgw_civetweb.cc: In member function 'virtual int RGWMongoose::send_status(const char*, const char*)':
rgw/rgw_civetweb.cc:147:38: error: 'mg_set_http_status' was not declared in this scope
   mg_set_http_status(conn, status_num);
                                      ^
It comes from this:

commit b8e28ab9f914bf48c9ba4f0def9a0deb9dbb93bc
Author: Yehuda Sadeh <yeh...@redhat.com>
Date:   Wed Jul 22 10:01:00 2015 -0700

    rgw: set http status in civetweb

    Need to set the http status in civetweb so that we report it correctly.
    Fixes: #12432

    Signed-off-by: Yehuda Sadeh <yeh...@redhat.com>

I suspect my civetweb may be obsolete. Running "cd src/civetweb && git tag -l"
returns v1.5. Do I need a magic git incantation to refresh submodules?

Thanks in advance,
-- Pete


Re: Moving away from Yum/DNF repo priorities for Ceph and ceph-deploy

2015-07-24 Thread Pete Zaitcev
On Thu, 23 Jul 2015 16:12:59 -0700
Travis Rhoden <trho...@redhat.com> wrote:

> I’m working on ways to improve Ceph installation with ceph-deploy, and
> a common hurdle we have hit involves dependency issues between ceph.com
> hosted RPM repos, and packages within EPEL.  For a while we were able to
> manage this with the priorities plugin, but then EPEL shipped packages
> that included changes that weren’t available on the ceph.com packages,
> and the EPEL packages “obsoleted” the ceph.com ones. [...]

This happens every time repos want to conflict with packages provided
by the base distro or EPEL. I think the only sensible way to manage it is
not to conflict and rely on the base packages. That means getting the right
versions in. In old EPEL/CentOS/RHEL this may mean versioned names or other
tricks, but the point is to get them in. You have access to experienced
packagers such as Haikel G., who can help you.

So, I'm just curious, what are the specific pain points that we're hitting,
which prevent using EPEL packages? What are you duplicating and why?

-- Pete
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What is omap

2015-06-26 Thread Pete Zaitcev
On Fri, 26 Jun 2015 14:48:15 +0100
Gregory Farnum <g...@gregs42.com> wrote:

> Each object consists of three different data storage areas, all of
> which are 100% optional: the "bundle of bits" object data, the object
> xattrs, and the object omap key-value store.

Thanks for the explanation. This was unexpected. I thought an object had
one K/V namespace associated with it, with xattrs and omap being two
implementations of it, switchable depending on OSD configuration.

> Each OSD has its *own* local leveldb where all that data
> goes; there's no cross-OSD LevelDB replication or communication.

Okay, that makes sense.

So, do I understand right that we don't have a document that
explains the above? All I want to write is "the list of buckets
is kept in the omap of object such-and-such; see URL FOO for an
explanation of omap".
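
For my own notes, the model as I now understand it, written as a toy Python
data structure. The class name and the example key are made up; this is a
mental model of the three areas, not the librados API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RadosObjectModel:
    """Toy model of the three optional storage areas of one object."""
    data: bytes = b""                                        # the "bundle of bits"
    xattrs: Dict[str, bytes] = field(default_factory=dict)   # extended attributes
    omap: Dict[str, bytes] = field(default_factory=dict)     # key-value store


# Hypothetical example: an index-like object with no data, only omap entries.
bucket_list = RadosObjectModel()
bucket_list.omap["foobucket"] = b""
```

In this picture xattrs and omap are separate areas of the same object, rather
than two interchangeable implementations of one namespace.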

-- Pete


What is omap

2015-06-24 Thread Pete Zaitcev
Dear All:

I am a complete beginner in Ceph and due to various circumstances I became
curious about omap. I made some quick web searches and read some random
writings about it, but I still have basic questions.

 - If I had to refer to omap in a doc/foo.rst, is there a canonical
   web-page, blog post, mailing list thread, or a (white)paper that
   serves to explain what it is?

   So far the best explanation I found was in ceph-devel archives
   by Gregory Farnum.

 - Every casual explanation I found presumes that an omap (a set
   of K/V) is associated with an object. But it is not physically in
   the object. So, is there a free-standing omap (set of keys)?
   Or an omap associated with something else, like a pool?

 - Greg says "The on-disk layout for these is more complicated
   (read about leveldb if you're interested)". Fair enough... But would
   someone be willing to shortcut this for me by explaining how this
   works in concert with the OSD? In particular, does the OSD replicate
   the files that LevelDB uses (like, for instance, Swift replicates
   SQLite files), or does it rely on LevelDB's own replication?
   Or perhaps the OSD replicates the upper-level view of K/V provided by
   LevelDB in a node and ignores its actual on-disk layout?

Thanks in advance,
-- Pete


Re: xattrs vs. omap with radosgw

2015-06-24 Thread Pete Zaitcev
On Tue, 16 Jun 2015 12:43:08 -0700 (PDT)
Sage Weil <s...@newdream.net> wrote:

> With hammer the size of object_info_t crossed the 255 byte boundary, which
> is the max xattr value that XFS can inline.  We've since merged something
> that stripes over several small xattrs so that we can keep things inline,
> but it hasn't been backported to hammer yet.  See
> c6cdb4081e366f471b372102905a1192910ab2da.

Meanwhile, Swift stopped striping altogether:
 
https://github.com/openstack/swift/commit/cc2f0f4ed6f12554b7d8e8cb61e14f2b103445a0

(but yes, it's still advantageous to fit into an inode)
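
For readers unfamiliar with the technique, a generic sketch of striping one
value over several small xattrs, in Python. The key naming scheme (name,
name@1, name@2, ...) and the function names are assumptions made for
illustration, not taken from the commit quoted above.

```python
from typing import Dict

MAX_INLINE = 255  # largest xattr value XFS can inline, per the quoted message


def stripe_xattr(name: str, value: bytes,
                 chunk: int = MAX_INLINE) -> Dict[str, bytes]:
    """Split value into pieces of at most `chunk` bytes under numbered keys."""
    if len(value) <= chunk:
        return {name: value}
    pieces = [value[i:i + chunk] for i in range(0, len(value), chunk)]
    return {(f"{name}@{i}" if i else name): piece
            for i, piece in enumerate(pieces)}


def join_xattr(name: str, attrs: Dict[str, bytes]) -> bytes:
    """Reassemble a value written by stripe_xattr()."""
    out = attrs[name]
    i = 1
    while f"{name}@{i}" in attrs:
        out += attrs[f"{name}@{i}"]
        i += 1
    return out
```

Each stored piece stays within the inline limit, at the cost of several
attribute reads to reassemble one value.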

-- Pete
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Client debugging and rados_conf_set()

2015-02-25 Thread Pete Zaitcev
On Tue, 24 Feb 2015 21:36:07 -0800 (PST)
Sage Weil <s...@newdream.net> wrote:

> > ldout(cct, 1) << "setting wanted keys" << dendl;
> >   rc = rados_conf_set(rados, "debug_rados", "20/20");
>
>    rados_conf_set(rados, "log_to_stderr", "true");

Thanks a lot, that worked!

-- Pete


Client debugging and rados_conf_set()

2015-02-24 Thread Pete Zaitcev
Hello:

I noticed that RadosClient.cc is full of interesting printouts, like so:

  ldout(cct, 1) << "setting wanted keys" << dendl;

So, I tried to enable them by doing this:

    rc = rados_conf_set(rados, "debug_rados", "20/20");

However, the above produces no output. What am I missing? How are these
printouts made visible conventionally?

Thanks in advance,
-- Pete