Re: [ceph-users] Use telegraf/influx to detect problems is very difficult

2019-12-11 Thread Mario Giammarco
Miroslav has already explained, better than I could, why it "is not so simple"
to just use math. And "osd down" was the easiest case. How can I calculate:
- monitor down
- osd near full

?

I do not understand why the ceph plugin cannot send to influx all the metrics
it has, especially the ones most useful for creating alarms.
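
For what it's worth, the osd case can be expressed as a query along the lines
Konstantin suggests below. This is only a sketch: it assumes the telegraf ceph
input writes a ceph_osdmap measurement with num_osd and num_up_osd fields into
a database called "telegraf" (measurement and field names vary between plugin
versions):

  influx -database telegraf -execute \
    'SELECT last("num_osd") - last("num_up_osd") AS "osds_down"
     FROM "ceph_osdmap" WHERE time > now() - 5m'

A Chronograf/Kapacitor alert can then fire whenever osds_down is greater than
zero; monitor quorum and near-full alarms need similar metrics to be exposed,
which is exactly the gap being discussed here.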

On Wed 11 Dec 2019 at 04:58 Konstantin Shalygin wrote:

> But it is very difficult/complicated to make simple queries because, for
> example I have osd up and osd total but not osd down metric.
>
>
> To determine how many osds are down you don't need a special metric, because
> you already have osd_up and osd_in metrics. Just use math.
>
>
>
>
> k
>


[ceph-users] Use telegraf/influx to detect problems is very difficult

2019-12-10 Thread Mario Giammarco
Hi,
I enabled the telegraf and influx plugins for my ceph cluster.
I would like to use influx/chronograf to detect anomalies:

- osd down
- monitor down
- osd near full

But it is very difficult/complicated to build even simple queries because,
for example, I have "osd up" and "osd total" metrics but no "osd down" metric.
Nor do I see any "osd near full" metric.

I suppose I am missing something; can you help me?
I have seen in old threads that others had similar problems, but I have found
no solutions.
Thanks,
Mario


Re: [ceph-users] How to recover from corrupted RocksDb

2018-11-29 Thread Mario Giammarco
The only strange thing is that ceph-bluestore-tool says that the repair was
done, no errors were found and all is ok.
I wonder what that tool really does.
Mario

On Thu 29 Nov 2018 at 11:03 Wido den Hollander wrote:

>
>
> On 11/29/18 10:45 AM, Mario Giammarco wrote:
> > I have only that copy, it is a showroom system but someone put a
> > production vm on it.
> >
>
> I have a feeling this won't be easy to fix or actually fixable:
>
> - Compaction error: Corruption: block checksum mismatch
> - submit_transaction error: Corruption: block checksum mismatch
>
> RocksDB got corrupted on that OSD and won't be able to start now.
>
> I wouldn't know where to start with this OSD.
>
> Wido
>
> > On Thu 29 Nov 2018 at 10:43, Wido den Hollander <w...@42on.com> wrote:
> >
> >
> >
> > On 11/29/18 10:28 AM, Mario Giammarco wrote:
> > > Hello,
> > > I have a ceph installation in a proxmox cluster.
> > > Due to a temporary hardware glitch now I get this error on osd
> startup
> > >
> > > -6> 2018-11-26 18:02:33.179327 7fa1d784be00  0 osd.0 1033
> > crush map
> > > has features 1009089991638532096, adjusting msgr requires for
> > osds
> > >-5> 2018-11-26 18:02:34.143084 7fa1c33f9700  3 rocksdb:
> > >
> >  [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1591]
> > > Compaction error: Corruption: block checksum mismatch
> > > -4> 2018-11-26 18:02:34.143123 7fa1c33f9700 4 rocksdb:
> > (Original Log
> > > Time 2018/11/26-18:02:34.143021)
> > > [/build/ceph-12.2.9/src/rocksdb/db/compaction_job.cc:621]
> > [default]
> > > compacted to: base level 1 max bytes base268435456 files[17$
> > >
> > > -3> 2018-11-26 18:02:34.143126 7fa1c33f9700 4 rocksdb:
> > (Original Log
> > > Time 2018/11/26-18:02:34.143068) EVENT_LOG_v1 {"time_micros":
> > > 1543251754143044, "job": 3, "event": "compaction_finished",
> > > "compaction_time_micros": 1997048, "out$
> > >-2> 2018-11-26 18:02:34.143152 7fa1c33f9700  2 rocksdb:
> > >
> >  [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1275]
> > > Waiting after background compaction error: Corruption: block
> > > checksum mismatch, Accumulated background err$
> > >-1> 2018-11-26 18:02:34.674171 7fa1c4bfc700 -1 rocksdb:
> > > submit_transaction error: Corruption: block checksum mismatch
> > code =
> > > 2 Rocksdb transaction:
> > > Delete( Prefix = O key =
> > >
> >
>   
> 0x7f7ffb6400217363'rub_3.26!='0xfffe'o')
> > > Put( Prefix = S key = 'nid_max' Value size = 8)
> > > Put( Prefix = S key = 'blobid_max' Value size = 8)
> > > 0> 2018-11-26 18:02:34.675641 7fa1c4bfc700 -1
> > > /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function
> > 'void
> > > BlueStore::_kv_sync_thread()' thread 7fa1c4bfc700 time
> 2018-11-26
> > > 18:02:34.674193
> > > /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 8717: FAILED
> > > assert(r == 0)
> > >
> > > ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217)
> > > luminous (stable)
> > > 1: (ceph::__ceph_assert_fail(char const*, char const*, int,
> char
> > > const*)+0x102) [0x55ec83876092]
> > > 2: (BlueStore::_kv_sync_thread()+0x24b5) [0x55ec836ffb55]
> > > 3: (BlueStore::KVSyncThread::entry()+0xd) [0x55ec8374040d]
> > > 4: (()+0x7494) [0x7fa1d5027494]
> > > 5: (clone()+0x3f) [0x7fa1d4098acf]
> > >
> > >
> > > I have tried to recover it using ceph-bluestore-tool fsck and
> repair
> > > DEEP but it says it is ALL ok.
> > > I see that rocksd ldb tool needs .db files to recover and not a
> > > partition so I cannot use it.
> > > I do not understand why I cannot start osd if ceph-bluestore-tools
> > says
> > > me I have lost no data.
> > > Can you help me?
> >
> > Why would you try to recover an individual OSD? If all your Placement
> > Groups are active(+clean) just wipe the OSD and re-deploy it.

Re: [ceph-users] How to recover from corrupted RocksDb

2018-11-29 Thread Mario Giammarco
I have only that copy, it is a showroom system but someone put a production
vm on it.

On Thu 29 Nov 2018 at 10:43 Wido den Hollander wrote:

>
>
> On 11/29/18 10:28 AM, Mario Giammarco wrote:
> > Hello,
> > I have a ceph installation in a proxmox cluster.
> > Due to a temporary hardware glitch now I get this error on osd startup
> >
> > -6> 2018-11-26 18:02:33.179327 7fa1d784be00  0 osd.0 1033 crush map
> > has features 1009089991638532096, adjusting msgr requires for osds
> >-5> 2018-11-26 18:02:34.143084 7fa1c33f9700  3 rocksdb:
> > [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1591]
> > Compaction error: Corruption: block checksum mismatch
> > -4> 2018-11-26 18:02:34.143123 7fa1c33f9700 4 rocksdb: (Original Log
> > Time 2018/11/26-18:02:34.143021)
> > [/build/ceph-12.2.9/src/rocksdb/db/compaction_job.cc:621] [default]
> > compacted to: base level 1 max bytes base268435456 files[17$
> >
> > -3> 2018-11-26 18:02:34.143126 7fa1c33f9700 4 rocksdb: (Original Log
> > Time 2018/11/26-18:02:34.143068) EVENT_LOG_v1 {"time_micros":
> > 1543251754143044, "job": 3, "event": "compaction_finished",
> > "compaction_time_micros": 1997048, "out$
> >-2> 2018-11-26 18:02:34.143152 7fa1c33f9700  2 rocksdb:
> > [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1275]
> > Waiting after background compaction error: Corruption: block
> > checksum mismatch, Accumulated background err$
> >-1> 2018-11-26 18:02:34.674171 7fa1c4bfc700 -1 rocksdb:
> > submit_transaction error: Corruption: block checksum mismatch code =
> > 2 Rocksdb transaction:
> > Delete( Prefix = O key =
> >
>  
> 0x7f7ffb6400217363'rub_3.26!='0xfffe'o')
> > Put( Prefix = S key = 'nid_max' Value size = 8)
> > Put( Prefix = S key = 'blobid_max' Value size = 8)
> > 0> 2018-11-26 18:02:34.675641 7fa1c4bfc700 -1
> > /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function 'void
> > BlueStore::_kv_sync_thread()' thread 7fa1c4bfc700 time 2018-11-26
> > 18:02:34.674193
> > /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 8717: FAILED
> > assert(r == 0)
> >
> > ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217)
> > luminous (stable)
> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > const*)+0x102) [0x55ec83876092]
> > 2: (BlueStore::_kv_sync_thread()+0x24b5) [0x55ec836ffb55]
> > 3: (BlueStore::KVSyncThread::entry()+0xd) [0x55ec8374040d]
> > 4: (()+0x7494) [0x7fa1d5027494]
> > 5: (clone()+0x3f) [0x7fa1d4098acf]
> >
> >
> > I have tried to recover it using ceph-bluestore-tool fsck and repair
> > DEEP but it says it is ALL ok.
> > I see that rocksd ldb tool needs .db files to recover and not a
> > partition so I cannot use it.
> > I do not understand why I cannot start osd if ceph-bluestore-tools says
> > me I have lost no data.
> > Can you help me?
>
> Why would you try to recover an individual OSD? If all your Placement
> Groups are active(+clean) just wipe the OSD and re-deploy it.
>
> What's the status of your PGs?
>
> It says there is a checksum error (probably due to the hardware glitch)
> so it refuses to start.
>
> Don't try to outsmart Ceph, let backfill/recovery handle this. Trying to
> manually fix this will only make things worse.
>
> Wido
>
> > Thanks,
> > Mario
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
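
For anyone finding this thread in the archives, the wipe-and-redeploy that
Wido describes is roughly the following on a Luminous cluster. This is only a
sketch; osd.0 and /dev/sdX stand in for the corrupted OSD and its disk, and on
Proxmox the pveceph wrapper may be preferred:

  systemctl stop ceph-osd@0
  ceph osd out osd.0
  ceph osd purge osd.0 --yes-i-really-mean-it   # removes it from crush, auth and the osd map
  ceph-volume lvm zap /dev/sdX --destroy        # wipe the old bluestore data
  ceph-volume lvm create --bluestore --data /dev/sdX

Then let backfill bring the placement groups back to active+clean.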


[ceph-users] How to recover from corrupted RocksDb

2018-11-29 Thread Mario Giammarco
Hello,
I have a ceph installation in a proxmox cluster.
Due to a temporary hardware glitch I now get this error on osd startup:

-6> 2018-11-26 18:02:33.179327 7fa1d784be00  0 osd.0 1033 crush map has
> features 1009089991638532096, adjusting msgr requires for osds
>-5> 2018-11-26 18:02:34.143084 7fa1c33f9700  3 rocksdb:
> [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1591]
> Compaction error: Corruption: block checksum mismatch
> -4> 2018-11-26 18:02:34.143123 7fa1c33f9700 4 rocksdb: (Original Log Time
> 2018/11/26-18:02:34.143021)
> [/build/ceph-12.2.9/src/rocksdb/db/compaction_job.cc:621] [default]
> compacted to: base level 1 max bytes base268435456 files[17$
>
> -3> 2018-11-26 18:02:34.143126 7fa1c33f9700 4 rocksdb: (Original Log Time
> 2018/11/26-18:02:34.143068) EVENT_LOG_v1 {"time_micros": 1543251754143044,
> "job": 3, "event": "compaction_finished", "compaction_time_micros":
>  1997048, "out$
>-2> 2018-11-26 18:02:34.143152 7fa1c33f9700  2 rocksdb:
> [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1275]
> Waiting after background compaction error: Corruption: block checksum
> mismatch, Accumulated background err$
>-1> 2018-11-26 18:02:34.674171 7fa1c4bfc700 -1 rocksdb:
> submit_transaction error: Corruption: block checksum mismatch code = 2
> Rocksdb transaction:
> Delete( Prefix = O key =
> 0x7f7ffb6400217363'rub_3.26!='0xfffe'o')
> Put( Prefix = S key = 'nid_max' Value size = 8)
> Put( Prefix = S key = 'blobid_max' Value size = 8)
> 0> 2018-11-26 18:02:34.675641 7fa1c4bfc700 -1
> /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function 'void
> BlueStore::_kv_sync_thread()' thread 7fa1c4bfc700 time 2018-11-26
> 18:02:34.674193
> /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 8717: FAILED assert(r ==
> 0)
>
> ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous
> (stable)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x102) [0x55ec83876092]
> 2: (BlueStore::_kv_sync_thread()+0x24b5) [0x55ec836ffb55]
> 3: (BlueStore::KVSyncThread::entry()+0xd) [0x55ec8374040d]
> 4: (()+0x7494) [0x7fa1d5027494]
> 5: (clone()+0x3f) [0x7fa1d4098acf]
>
>
I have tried to recover it using ceph-bluestore-tool fsck and repair (deep),
but it says everything is ok.
I see that the RocksDB ldb tool needs .db files to recover, not a partition,
so I cannot use it.
I do not understand why I cannot start the osd if ceph-bluestore-tool tells
me I have lost no data.
Can you help me?
Thanks,
Mario
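
For readers hitting the same error, the fsck/repair mentioned above is run
with the OSD stopped, roughly like this (a sketch; /var/lib/ceph/osd/ceph-0
stands in for the failing OSD's data path):

  systemctl stop ceph-osd@0
  ceph-bluestore-tool fsck   --path /var/lib/ceph/osd/ceph-0 --deep 1
  ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0 --deep 1

As the rest of the thread concludes, if all PGs are active+clean elsewhere the
safer path is simply to wipe and re-deploy the OSD rather than trying to fix
its RocksDB.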


Re: [ceph-users] How to really change public network in ceph

2018-02-21 Thread Mario Giammarco
Let me ask a simpler question: when I change the network of the monitors and
the network of the osds, how do the monitors learn the new addresses of the osds?
Thanks,
Mario
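
For what it's worth, the osd side of this is usually just configuration plus a
restart: each osd binds to an address in the configured public network and
reports that address to the monitors when it boots. A sketch, assuming the new
subnet is 10.1.5.0/24 and <id> is a placeholder:

  # /etc/ceph/ceph.conf on every node
  [global]
  public network = 10.1.5.0/24

  # then restart the osds one at a time
  systemctl restart ceph-osd@<id>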

2018-02-19 10:22 GMT+01:00 Mario Giammarco <mgiamma...@gmail.com>:

> Hello,
> I have a test proxmox/ceph cluster with four servers.
> I need to change the ceph public subnet from 10.1.0.0/24 to 10.1.5.0/24.
> I have read documentation and tutorials.
> The most critical part seems monitor map editing.
> But it seems to me that osds need to bind to new subnet too.
> I tried to put 10.1.0 and 10.1.5 subnets to public but it seems it changes
> nothing.
> Infact official documentation is unclear: it says you can put in public
> network more than one subnet. It says they must be routed. But it does not
> say what happens when you use multiple subnets or why you should do it.
>
> So I need help on these questions:
> - exact sequence of operations to change public network in ceph (not only
> monitors, also osds)
> - details on multiple subnets in public networks in ceph
>
> Thanks in advance for any help,
> Mario
>


[ceph-users] How to really change public network in ceph

2018-02-19 Thread Mario Giammarco
Hello,
I have a test proxmox/ceph cluster with four servers.
I need to change the ceph public subnet from 10.1.0.0/24 to 10.1.5.0/24.
I have read documentation and tutorials.
The most critical part seems to be the monitor map editing.
But it seems to me that the osds need to bind to the new subnet too.
I tried to put both the 10.1.0 and 10.1.5 subnets in the public network, but
it seems to change nothing.
In fact the official documentation is unclear: it says you can put more than
one subnet in the public network, and that they must be routed, but it does
not say what happens when you use multiple subnets or why you should do it.

So I need help on these questions:
- exact sequence of operations to change public network in ceph (not only
monitors, also osds)
- details on multiple subnets in public networks in ceph

Thanks in advance for any help,
Mario
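
For reference, the monitor map editing mentioned above boils down to something
like the following. This is only a sketch of the documented procedure; the
monitor id "a" and the address 10.1.5.1 are placeholders, and the monitors
must be stopped while the new map is injected:

  ceph mon getmap -o /tmp/monmap            # or: ceph-mon -i a --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap
  monmaptool --rm a /tmp/monmap
  monmaptool --add a 10.1.5.1:6789 /tmp/monmap
  ceph-mon -i a --inject-monmap /tmp/monmap # repeat on every monitor

Afterwards update mon_host / the [mon] addresses in ceph.conf and start the
monitors again; the osds only need the new public network in ceph.conf and a
restart.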


Re: [ceph-users] Cache tiering on Erasure coded pools

2018-01-03 Thread Mario Giammarco
Nobody explains why, so I will tell you from direct experience: the cache tier
works with a block size of several megabytes. So if you ask for one byte that
is not in the cache, several megabytes are read from disk and, if the cache is
full, several more megabytes are evicted from the cache to the EC pool.
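
For reference, the cache-free approach that David points to below (Luminous
and later) looks roughly like this; only a sketch, with ecpool and rbdmeta as
placeholder pool names:

  ceph osd pool set ecpool allow_ec_overwrites true
  rbd create --size 100G --data-pool ecpool rbdmeta/image1

The image data then lives directly in the EC pool while the metadata stays in
the replicated pool, with no cache tier and none of the multi-megabyte
promotion traffic described above.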

On Thu 28 Dec 2017 at 20:54 Karun Josy wrote:

> Hello David,
>
> Thank you!
> We setup 2 pools to use EC with RBD. One ecpool and other normal
> replicated pool.
>
> However, would it still be advantageous to add a replicated cache tier in
> front of an EC one, even though it is not required anymore? I would still
> assume that replication would be less intensive than EC computing?
>
>
> Karun Josy
>
> On Wed, Dec 27, 2017 at 3:42 AM, David Turner 
> wrote:
>
>> Please use the version of the docs for your installed version of ceph.
>> Note the Jewel in your URL and the Luminous in mine.  In Luminous you no
>> longer need a cache tier to use EC with RBDs.
>>
>> http://docs.ceph.com/docs/luminous/rados/operations/cache-tiering/
>>
>> On Tue, Dec 26, 2017, 4:21 PM Karun Josy  wrote:
>>
>>> Hi,
>>>
>>> We are using Erasure coded pools in a ceph cluster for RBD images.
>>> Ceph version is 12.2.2 Luminous.
>>>
>>> -
>>> http://docs.ceph.com/docs/jewel/rados/operations/cache-tiering/
>>> -
>>>
>>> Here it says we can use a Cache tiering infront of ec pools.
>>> To use erasure code with RBD we  have a replicated pool to store
>>> metadata and  ecpool as data pool .
>>>
>>> Is it possible to setup cache tiering since there is already a
>>> replicated pool that is being used ?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Karun Josy
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


Re: [ceph-users] Moving bluestore WAL and DB after bluestore creation

2017-11-15 Thread Mario Giammarco
It seems it is not possible. I recreated the OSD.
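
For anyone searching the archives later: the workaround is indeed to remove
the OSD from the cluster and prepare it again with the db on the faster
device. A sketch using the luminous-era ceph-disk, with /dev/sdg as the data
disk and /dev/nvme0n1 as the db device (both placeholders):

  ceph-disk zap /dev/sdg
  ceph-disk prepare --bluestore /dev/sdg --block.db /dev/nvme0n1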

2017-11-12 17:44 GMT+01:00 Shawn Edwards :

> I've created some Bluestore OSD with all data (wal, db, and data) all on
> the same rotating disk.  I would like to now move the wal and db onto an
> nvme disk.  Is that possible without re-creating the OSD?
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


[ceph-users] Bluestore compression statistics

2017-11-01 Thread Mario Giammarco
Hello,
I have enabled bluestore compression, how can I get some statistics just to
see if compression is really working?
Thanks,
Mario
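
In case it helps someone searching later: the per-OSD perf counters are one
place to look. A sketch, assuming the admin socket of osd.0 and Luminous-era
counter names (they can differ between releases):

  ceph daemon osd.0 perf dump | grep -E 'compress|bluestore_compressed'

Counters such as bluestore_compressed, bluestore_compressed_original and
bluestore_compressed_allocated show how many bytes were stored compressed
versus their original size.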


Re: [ceph-users] PGs inconsistent, do I fear data loss?

2017-11-01 Thread Mario Giammarco
I read your post, then read the thread you suggested; very interesting.
Then I read your post again and understood it better.
The most important thing is that even with min_size=1, writes are
acknowledged only after ceph has written size=2 copies.
In the thread above there is:

As David already said, when all OSDs are up and in for a PG Ceph will
wait for ALL OSDs to Ack the write. Writes in RADOS are always
synchronous.

Only when OSDs go down you need at least min_size OSDs up before
writes or reads are accepted.

So if min_size = 2 and size = 3 you need at least 2 OSDs online for
I/O to take place.


You show me a sequence of events that may happen in some use cases.
My use case is quite different. We use ceph under proxmox.
The servers have their disks on RAID 5 (I agree that it is better to expose
single disks to Ceph, but it is too late for that).
So it is unlikely that a ceph disk fails, thanks to the RAID. If a disk fails,
it is probably because the entire server has failed (and we need to provide
business availability in that case), so it will never come up again; in my
situation your sequence of events will never happen.
What shocked me is that I did not expect to see so many inconsistencies.
Thanks,
Mario
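
For readers following along, the replication settings discussed here are per
pool and can be inspected and changed at any time; a sketch, with "rbd" as a
placeholder pool name:

  ceph osd pool get rbd size
  ceph osd pool get rbd min_size
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2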


2017-11-01 16:45 GMT+01:00 David Turner <drakonst...@gmail.com>:

> It looks like you're running with a size = 2 and min_size = 1 (the
> min_size is a guess, the size is based on how many osds belong to your
> problem PGs).  Here's some good reading for you.  https://www.spinics.net/
> lists/ceph-users/msg32895.html
>
> Basically the jist is that when running with size = 2 you should assume
> that data loss is an eventuality and choose that it is ok for your use
> case.  This can be mitigated by using min_size = 2, but then your pool will
> block while an OSD is down and you'll have to manually go in and change the
> min_size temporarily to perform maintenance.
>
> All it takes for data loss is that an osd on server 1 is marked down and a
> write happens to an osd on server 2.  Now the osd on server 2 goes down
> before the osd on server 1 has finished backfilling and the first osd
> receives a request to modify data in the object that it doesn't know the
> current state of.  Tada, you have data loss.
>
> How likely is this to happen... eventually it will.  PG subfolder
> splitting (if you're using filestore) will occasionally take long enough to
> perform the task that the osd is marked down while it's still running, and
> this usually happens for some time all over the cluster when it does.
> Another option is something that causes segfaults in the osds; another is
> restarting a node before all pgs are done backfilling/recovering; OOM
> killer; power outages; etc; etc.
>
> Why does min_size = 2 prevent this?  Because for a write to be
> acknowledged by the cluster, it has to be written to every OSD that is up
> as long as there are at least min_size available.  This means that every
> write is acknowledged by at least 2 osds every time.  If you're running
> with size = 2, then both copies of the data need to be online for a write
> to happen and thus can never have a write that the other does not.  If
> you're running with size = 3, then you always have a majority of the OSDs
> online receiving a write and they can both agree on the correct data to
> give to the third when it comes back up.
>
> On Wed, Nov 1, 2017 at 3:31 AM Mario Giammarco <mgiamma...@gmail.com>
> wrote:
>
>> Sure here it is ceph -s:
>>
>> cluster:
>>id: 8bc45d9a-ef50-4038-8e1b-1f25ac46c945
>>health: HEALTH_ERR
>>100 scrub errors
>>Possible data damage: 56 pgs inconsistent
>>
>>  services:
>>mon: 3 daemons, quorum 0,1,pve3
>>mgr: pve3(active)
>>osd: 3 osds: 3 up, 3 in
>>
>>  data:
>>pools:   1 pools, 256 pgs
>>objects: 269k objects, 1007 GB
>>usage:   2050 GB used, 1386 GB / 3436 GB avail
>>pgs: 200 active+clean
>> 56  active+clean+inconsistent
>>
>> ---
>>
>> ceph health detail :
>>
>> PG_DAMAGED Possible data damage: 56 pgs inconsistent
>>pg 2.6 is active+clean+inconsistent, acting [1,0]
>>pg 2.19 is active+clean+inconsistent, acting [1,2]
>>pg 2.1e is active+clean+inconsistent, acting [1,2]
>>pg 2.1f is active+clean+inconsistent, acting [1,2]
>>pg 2.24 is active+clean+inconsistent, acting [0,2]
>>pg 2.25 is active+clean+inconsistent, acting [2,0]
>>pg 2.36 is active+clean+inconsistent, acting [1,0]
>>pg 2.3d is active+clean+inconsistent, acting [1,2]
>>pg 2.4b is active+clean+inconsistent, acting [1,0]
>>pg 2.4c is active+clean+inconsistent, acting [0,2]
>>pg 2.4d

Re: [ceph-users] PGs inconsistent, do I fear data loss?

2017-11-01 Thread Mario Giammarco
 20:30:01.443288",
   "last_became_active": "2017-10-15 20:30:35.752042",
   "last_became_peered": "2017-10-15 20:30:35.752042",
   "last_unstale": "2017-10-15 20:35:36.930611",
   "last_undegraded": "2017-10-15 20:30:35.749043",
   "last_fullsized": "2017-10-15 20:30:35.749043",
   "mapping_epoch": 1338,
   "log_start": "1274'68440",
   "ondisk_log_start": "1274'68440",
   "created": 90,
   "last_epoch_clean": 1331,
   "parent": "0.0",
   "parent_split_bits": 0,
   "last_scrub": "1294'71370",
   "last_scrub_stamp": "2017-10-15 09:27:31.756027",
   "last_deep_scrub": "1284'70813",
   "last_deep_scrub_stamp": "2017-10-14 06:35:57.556773",
   "last_clean_scrub_stamp": "2017-10-15 09:27:31.756027",
   "log_size": 3025,
   "ondisk_log_size": 3025,
   "stats_invalid": false,
   "dirty_stats_invalid": false,
   "omap_stats_invalid": false,
   "hitset_stats_invalid": false,
   "hitset_bytes_stats_invalid": false,
   "pin_stats_invalid": false,
   "stat_sum": {
   "num_bytes": 3555027456,
   "num_objects": 917,
   "num_object_clones": 255,
   "num_object_copies": 1834,
   "num_objects_missing_on_primary": 0,
   "num_objects_missing": 0,
   "num_objects_degraded": 917,
   "num_objects_misplaced": 0,
   "num_objects_unfound": 0,
   "num_objects_dirty": 917,
   "num_whiteouts": 0,
   "num_read": 275095,
   "num_read_kb": 111713846,
   "num_write": 64324,
   "num_write_kb": 11365374,
   "num_scrub_errors": 0,
   "num_shallow_scrub_errors": 0,
   "num_deep_scrub_errors": 0,
   "num_objects_recovered": 243,
   "num_bytes_recovered": 1008594432,
   "num_keys_recovered": 6,
   "num_objects_omap": 0,
   "num_objects_hit_set_archive": 0,
   "num_bytes_hit_set_archive": 0,
   "num_flush": 0,
   "num_flush_kb": 0,
   "num_evict": 0,
   "num_evict_kb": 0,
   "num_promote": 0,
   "num_flush_mode_high": 0,
   "num_flush_mode_low": 0,
   "num_evict_mode_some": 0,
   "num_evict_mode_full": 0,
   "num_objects_pinned": 0,
   "num_legacy_snapsets": 0
   },
   "up": [
   1,
   0
   ],
   "acting": [
   1,
   0
   ],
   "blocked_by": [],
   "up_primary": 1,
   "acting_primary": 1
   },
   "empty": 0,
   "dne": 0,
   "incomplete": 0,
   "last_epoch_started": 1339,
   "hit_set_history": {
   "current_last_update": "0'0",
   "history": []
   }
   }
   ],
   "recovery_state": [
   {
   "name": "Started/Primary/Active",
   "enter_time": "2017-10-15 20:36:33.574915",
   "might_have_unfound": [
   {
   "osd": "0",
   "status": "already probed"
   }
   ],
   "recovery_progress": {
   "backfill_targets": [],
   "waiting_on_backfill": [],
   "last_backfill_started": "MIN",
   "backfill_info": {
   "begin": "MIN",
   "end": "MIN",
   "objects": []
   },
   "peer_backfill_info": [],
   "backfills_in_flight": [],
   "recovering": [],
   "pg_backend": {
   "pull_from_peer": [],
   "pushing": []
   }
   },
   "scrub": {
   "scrubber.epoch_start": "1338",
   "scrubber.active": false,
   "scrubber.state": "INACTIVE",
   "scrubber.start": "MIN",
   "scrubber.end": "MIN",
   "scrubber.subset_last_update": "0'0",
   "scrubber.deep": false,
   "scrubber.seed": 0,
   "scrubber.waiting_on": 0,
   "scrubber.waiting_on_whom": []
   }
   },
   {
   "name": "Started",
   "enter_time": "2017-10-15 20:36:32.592892"
   }
   ],
   "agent_state": {}
}





2017-10-30 23:30 GMT+01:00 Gregory Farnum <gfar...@redhat.com>:

> You'll need to tell us exactly what error messages you're seeing, what the
> output of ceph -s is, and the output of pg query for the relevant PGs.
> There's not a lot of documentation because much of this tooling is new,
> it's changing quickly, and most people don't have the kinds of problems
> that turn out to be unrepairable. We should do better about that, though.
> -Greg
>
> On Mon, Oct 30, 2017, 11:40 AM Mario Giammarco <mgiamma...@gmail.com>
> wrote:
>
>>  >[Questions to the list]
>>  >How is it possible that the cluster cannot repair itself with ceph pg
>> repair?
>>  >No good copies are remaining?
>>  >Cannot decide which copy is valid or up-to date?
>>  >If so, why not, when there is checksum, mtime for everything?
>>  >In this inconsistent state which object does the cluster serve when it
>> doesn't know which one is the valid?
>>
>>
>> I am asking the same questions too, it seems strange to me that in a
>> fault tolerant clustered file storage like Ceph there is no
>> documentation about this.
>>
>> I know that I am pedantic but please note that saying "to be sure use
>> three copies" is not enough because I am not sure what Ceph really does
>> when three copies are not matching.
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>


Re: [ceph-users] PGs inconsistent, do I fear data loss?

2017-10-30 Thread Mario Giammarco

>[Questions to the list]
>How is it possible that the cluster cannot repair itself with ceph pg repair?
>No good copies are remaining?
>Cannot decide which copy is valid or up-to date?
>If so, why not, when there is checksum, mtime for everything?
>In this inconsistent state which object does the cluster serve when it doesn't know which one is the valid?



I am asking the same questions too; it seems strange to me that in a
fault-tolerant clustered storage system like Ceph there is no documentation
about this.

I know that I am being pedantic, but please note that saying "to be sure, use
three copies" is not enough, because I am not sure what Ceph really does when
the three copies do not match.
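
For reference, on a Luminous/BlueStore cluster the tooling can at least show
which copy failed its checksum, which partly answers the "which one is valid"
question; a sketch, with 2.6 as a placeholder pg id:

  rados list-inconsistent-obj 2.6 --format=json-pretty
  ceph pg repair 2.6

The JSON lists, per object, which shard reported a read or checksum error, and
repair rewrites the bad copy from a good one when it can pick an authoritative
copy.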








Re: [ceph-users] PGs inconsistent, do I fear data loss?

2017-10-30 Thread Mario Giammarco

>In general you should find that clusters running bluestore are much more
>effective about doing a repair automatically (because bluestore has
>checksums on all data, it knows which object is correct!), but there are
>still some situations where they won't. If that happens to you, I would not
>follow directions to resolve it unless they have the *exact* same symptoms
>you do, or you've corresponded with the list about it. :)

Thanks, but it is happening to me, so what can I do?

BTW: I suppose that in my case the problem is due to bitrot, because in my
test cluster I had two disks with unreadable sectors and bluestore completely
discarded them and put them out of the cluster.


So how does bluestore repair a pg? Does it move the data to another place on the hdd?


Thanks,

Mario



[ceph-users] PGs inconsistent, do I fear data loss?

2017-10-28 Thread Mario Giammarco
Hello,
we recently upgraded two clusters to Ceph luminous with bluestore and we
discovered that we have many more pgs in state active+clean+inconsistent.
(Possible data damage, xx pgs inconsistent)

This is probably due to checksums in bluestore that discover more errors.

We have some pools with replica 2 and some with replica 3.

I have read past forum threads and I have seen that Ceph does not repair
inconsistent pgs automatically.

Even manual repair sometimes fails.

I would like to understand if I am losing my data:

- with replica 2 I hope that ceph chooses the right replica by looking at checksums
- with replica 3 I hope that there are no problems at all

How can I tell ceph to simply create the second replica in another place?

Because I suppose that with replica 2 and inconsistent pgs I have only one
copy of data.

Thank you in advance for any help.

Mario


Re: [ceph-users] Questions about bluestore

2017-10-14 Thread Mario Giammarco
Can nobody help me?

On Fri 6 Oct 2017, 07:31 Mario Giammarco <mgiamma...@gmail.com> wrote:

> Hello,
> I am trying Ceph luminous with Bluestore.
>
> I create an osd:
>
> ceph-disk prepare --bluestore /dev/sdg  --block.db /dev/sdf
>
> and I see that on ssd it creates a partition of only 1g for block.db
>
> So:
>
> ceph-disk prepare --bluestore /dev/sdg --block.wal /dev/sdf --block.db
> /dev/sdf
>
> and again it creates two partitions, 1g and 500mb
>
> It seems to me that they are too small and the ssd is underutilized (the
> docs say you need an ssd greater than 1g to put a block.db on)
>
> Other two questions:
>
> - if I already have a bluestore osd, can I later move the db to an ssd?
> - the docs say that I can put the block.db of several different osds on one
> ssd. But what happens if the ssd breaks?
>
> Thanks,
> Mario
>
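
Regarding the tiny default partitions: the luminous-era ceph-disk sizes them
from configuration options, so as far as I know they can be made larger before
preparing the OSD. A sketch, the sizes being examples only:

  # /etc/ceph/ceph.conf, [global] or [osd] section
  bluestore_block_db_size  = 32212254720   # ~30 GiB
  bluestore_block_wal_size = 1073741824    # 1 GiB

  ceph-disk prepare --bluestore /dev/sdg --block.wal /dev/sdf --block.db /dev/sdf

Existing OSDs keep whatever partitions they were created with, which is why
the "Moving bluestore WAL and DB" thread above ended with re-creating the OSD.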


Re: [ceph-users] Another cluster completely hang

2016-06-30 Thread Mario Giammarco
Last two questions:
1) I have used other systems in the past. In case of split brain or serious
problems they let me choose which copy was "good" and then work again. Is
there a way to tell ceph that all is ok? This morning I again have 19
incomplete pgs after recovery.
2) Where can I find paid support? I mean someone who logs in to my cluster
and tells ceph that all is active+clean.

Thanks,
Mario

On Wed 29 Jun 2016 at 16:08 Mario Giammarco <mgiamma...@gmail.com> wrote:

> This time at the end of recovery procedure you described it was like most
> pgs active+clean 20 pgs incomplete.
> After that when trying to use the cluster I got "request blocked more
> than" and no vm can start.
> I know that something has happened after the broken disk, probably a
> server reboot. I am investigating.
> But even if I find the origin of the problem it will not help in finding a
> solution now.
> So I am using my time in repairing the pool only to save the production
> data and I will throw away the rest.
> Now after marking all pgs as complete with ceph_objectstore_tool I see
> that:
>
> 1) ceph has put out three hdds ( I suppose due to scrub but it is my only
> my idea, I will check logs) BAD
> 2) it is recovering for objects degraded and misplaced GOOD
> 3) vm are not usable yet BAD
> 4) I see some pgs in state down+peering (I hope is not BAD)
>
> Regarding 1) how I can put again that three hdds in the cluster? Should I
> remove them from crush and start again?
> Can I tell ceph that they are not bad?
> Mario
>
On Wed 29 Jun 2016 at 15:34 Lionel Bouton <lionel+c...@bouton.name> wrote:
>
>> Hi,
>>
>> On 29/06/2016 12:00, Mario Giammarco wrote:
>> > Now the problem is that ceph has put out two disks because scrub  has
>> > failed (I think it is not a disk fault but due to mark-complete)
>>
>> There is something odd going on. I've only seen deep-scrub failing (ie
>> detect one inconsistency and marking the pg so) so I'm not sure what
>> happens in the case of a "simple" scrub failure but what should not
>> happen is the whole OSD going down on a scrub or deep-scrub failure, which
>> you seem to imply did happen.
>> Do you have logs for these two failures giving a hint at what happened
>> (probably /var/log/ceph/ceph-osd..log) ? Any kernel log pointing to
>> hardware failure(s) around the time these events happened ?
>>
>> Another point : you said that you had one disk "broken". Usually ceph
>> handles this case in the following manner :
>> - the OSD detects the problem and commit suicide (unless it's configured
>> to ignore IO errors which is not the default),
>> - your cluster is then in degraded state with one OSD down/in,
>> - after a timeout (several minutes), Ceph decides that the OSD won't
>> come up again soon and marks the OSD "out" (so one OSD down/out),
>> - as the OSD is out, crush adapts pg positions based on the remaining
>> available OSDs and bring back all degraded pg to clean state by creating
>> missing replicas while moving pgs around. You see a lot of IO, many pg
>> in wait_backfill/backfilling states at this point,
>> - when all is done the cluster is back to HEALTH_OK
>>
>> When your disk was broken and you waited 24 hours how far along this
>> process was your cluster ?
>>
>> Best regards,
>>
>> Lionel
>>
>


Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
This time, at the end of the recovery procedure you described, it ended with
most pgs active+clean and 20 pgs incomplete.
After that, when trying to use the cluster, I got "request blocked more than"
warnings and no vm could start.
I know that something happened after the broken disk, probably a server
reboot. I am investigating.
But even if I find the origin of the problem it will not help in finding a
solution now.
So I am using my time to repair the pool, only to save the production data,
and I will throw away the rest.
Now, after marking all pgs as complete with ceph-objectstore-tool, I see that:

1) ceph has put three hdds out (I suppose due to scrub, but that is only my
guess, I will check the logs) BAD
2) it is recovering the degraded and misplaced objects GOOD
3) the vms are not usable yet BAD
4) I see some pgs in state down+peering (I hope that is not BAD)

Regarding 1): how can I put those three hdds back in the cluster? Should I
remove them from crush and start again?
Can I tell ceph that they are not bad?
Mario

On Wed 29 Jun 2016 at 15:34 Lionel Bouton <lionel+c...@bouton.name> wrote:

> Hi,
>
> On 29/06/2016 12:00, Mario Giammarco wrote:
> > Now the problem is that ceph has put out two disks because scrub  has
> > failed (I think it is not a disk fault but due to mark-complete)
>
> There is something odd going on. I've only seen deep-scrub failing (ie
> detect one inconsistency and marking the pg so) so I'm not sure what
> happens in the case of a "simple" scrub failure but what should not
> happen is the whole OSD going down on a scrub or deep-scrub failure, which
> you seem to imply did happen.
> Do you have logs for these two failures giving a hint at what happened
> (probably /var/log/ceph/ceph-osd..log) ? Any kernel log pointing to
> hardware failure(s) around the time these events happened ?
>
> Another point : you said that you had one disk "broken". Usually ceph
> handles this case in the following manner :
> - the OSD detects the problem and commit suicide (unless it's configured
> to ignore IO errors which is not the default),
> - your cluster is then in degraded state with one OSD down/in,
> - after a timeout (several minutes), Ceph decides that the OSD won't
> come up again soon and marks the OSD "out" (so one OSD down/out),
> - as the OSD is out, crush adapts pg positions based on the remaining
> available OSDs and bring back all degraded pg to clean state by creating
> missing replicas while moving pgs around. You see a lot of IO, many pg
> in wait_backfill/backfilling states at this point,
> - when all is done the cluster is back to HEALTH_OK
>
> When your disk was broken and you waited 24 hours how far along this
> process was your cluster ?
>
> Best regards,
>
> Lionel
>


Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
Just one question: why does ceph refuse to do I/O on the good pgs when it has
some incomplete pgs?

On Wed 29 Jun 2016 at 12:55 Oliver Dzombic <i...@ip-interactive.de> wrote:

> Hi,
>
> again:
>
> You >must< check all your logs ( as fucky as it is for sure ).
>
> Means on the ceph nodes in /var/log/ceph/*
>
> And go back to the time where things went down the hill.
>
> There must be something else going on, beyond normal osd crash.
>
> And your manual pg repair/pg remove/pg set complete is, most probably,
> just getting your situation worst.
>
> So really, if you want to have a chance to find out whats going on, you
> must check all the logs. Especially the OSD logs, especially the OSD log
> of the OSD you removed, and then the OSD logs of those pg, which are
> incomplete/stuck/what_ever_not_good.
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:i...@ip-interactive.de
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
>
>
> > On 29.06.2016 at 12:33, Mario Giammarco wrote:
> > Thanks,
> > I can put in osds but the do not stay in, and I am pretty sure that are
> > not broken.
> >
> > On Wed 29 Jun 2016 at 12:07 Oliver Dzombic <i...@ip-interactive.de> wrote:
> >
> > hi,
> >
> > ceph osd set noscrub
> > ceph osd set nodeep-scrub
> >
> > ceph osd in 
> >
> >
> > --
> > Mit freundlichen Gruessen / Best regards
> >
> > Oliver Dzombic
> > IP-Interactive
> >
> > mailto:i...@ip-interactive.de <mailto:i...@ip-interactive.de>
> >
> > Anschrift:
> >
> > IP Interactive UG ( haftungsbeschraenkt )
> > Zum Sonnenberg 1-3
> > 63571 Gelnhausen
> >
> > HRB 93402 beim Amtsgericht Hanau
> > Geschäftsführung: Oliver Dzombic
> >
> > Steuer Nr.: 35 236 3622 1
> > UST ID: DE274086107
> >
> >
> > On 29.06.2016 at 12:00, Mario Giammarco wrote:
> > > Now the problem is that ceph has put out two disks because scrub
> has
> > > failed (I think it is not a disk fault but due to mark-complete)
> > > How can I:
> > > - disable scrub
> > > - put in again the two disks
> > >
> > > I will wait anyway the end of recovery to be sure it really works
> > again
> > >
> > > On Wed 29 Jun 2016 at 11:16 Mario Giammarco <mgiamma...@gmail.com> wrote:
> > >
> > > Infact I am worried because:
> > >
> > > 1) ceph is under proxmox, and proxmox may decide to reboot a
> > server
> > > if it is not responding
> > > 2) probably a server was rebooted while ceph was reconstructing
> > > 3) even using max=3 do not help
> > >
> > > Anyway this is the "unofficial" procedure that I am using, much
> > > simpler than blog post:
> > >
> > > 1) find host where is pg
> > > 2) stop ceph in that host
> > > 3) ceph-objectstore-tool --pgid 1.98 --op mark-complete
> > --data-path
> > > /var/lib/ceph/osd/ceph-9 --journal-path
> > > /var/lib/ceph/osd/ceph-9/journal
> > > 4) start ceph
> > > 5) look finally it reconstructing
> > >
> > > On Wed 29 Jun 2016 at 11:11 Oliver Dzombic <i...@ip-interactive.de> wrote:
> > >
> > > Hi,
> > >
> > > removing ONE disk while your replication is 2, is no
> problem.
> > >
> > > You dont need to wait a single second to replace of remove
> > it. Its
> > > anyway not used and out/down. So from ceph's point of view
> its
> > > not existent.
> > >
> > > 
> > >
> > >

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
Thanks,
I can put the osds in but they do not stay in, and I am pretty sure they are
not broken.

On Wed 29 Jun 2016 at 12:07 Oliver Dzombic <i...@ip-interactive.de> wrote:

> hi,
>
> ceph osd set noscrub
> ceph osd set nodeep-scrub
>
> ceph osd in 
>
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:i...@ip-interactive.de
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
>
>
> On 29.06.2016 at 12:00, Mario Giammarco wrote:
> > Now the problem is that ceph has put out two disks because scrub  has
> > failed (I think it is not a disk fault but due to mark-complete)
> > How can I:
> > - disable scrub
> > - put in again the two disks
> >
> > I will wait anyway the end of recovery to be sure it really works again
> >
> > On Wed 29 Jun 2016 at 11:16 Mario Giammarco <mgiamma...@gmail.com> wrote:
> >
> > Infact I am worried because:
> >
> > 1) ceph is under proxmox, and proxmox may decide to reboot a server
> > if it is not responding
> > 2) probably a server was rebooted while ceph was reconstructing
> > 3) even using max=3 do not help
> >
> > Anyway this is the "unofficial" procedure that I am using, much
> > simpler than blog post:
> >
> > 1) find host where is pg
> > 2) stop ceph in that host
> > 3) ceph-objectstore-tool --pgid 1.98 --op mark-complete --data-path
> > /var/lib/ceph/osd/ceph-9 --journal-path
> > /var/lib/ceph/osd/ceph-9/journal
> > 4) start ceph
> > 5) look finally it reconstructing
> >
> > On Wed 29 Jun 2016 at 11:11 Oliver Dzombic <i...@ip-interactive.de> wrote:
> >
> > Hi,
> >
> > removing ONE disk while your replication is 2, is no problem.
> >
> > You dont need to wait a single second to replace of remove it.
> Its
> > anyway not used and out/down. So from ceph's point of view its
> > not existent.
> >
> > 
> >
> > But as christian told you already, what we see now fits to a
> > szenario
> > where you lost the osd and eighter you did something, or
> > something else
> > happens, but the data were not recovered again.
> >
> > Eighter because another OSD was broken, or because you did
> > something.
> >
> > Maybe, because of the "too many PGs per OSD (307 > max 300)"
> > ceph never
> > recovered.
> >
> > What i can see from http://pastebin.com/VZD7j2vN is that
> >
> > OSD 5,13,9,0,6,2,3 and maybe others, are the OSD's holding the
> > incomplete data.
> >
> > This are 7 OSD's from 10. So something happend to that OSD's or
> > the data
> > in them. And that had nothing to do with a single disk failing.
> >
> > Something else must have been happend.
> >
> > And as christian already wrote: you will have to go through your
> > logs
> > back until the point were things going down.
> >
> > Because a fail of a single OSD, no matter what your replication
> > size is,
> > can ( normally ) not harm the consistency of 7 other OSD's,
> > means 70% of
> > your total cluster.
> >
> > --
> > Mit freundlichen Gruessen / Best regards
> >
> > Oliver Dzombic
> > IP-Interactive
> >
> > mailto:i...@ip-interactive.de <mailto:i...@ip-interactive.de>
> >
> > Anschrift:
> >
> > IP Interactive UG ( haftungsbeschraenkt )
> > Zum Sonnenberg 1-3
> > 63571 Gelnhausen
> >
> > HRB 93402 beim Amtsgericht Hanau
> > Geschäftsführung: Oliver Dzombic
> >
> > Steuer Nr.: 35 236 3622 1
> > UST ID: DE274086107
> >
> >
> > On 29.06.2016 at 10:56, Mario Giammarco wrote:
> > > Yes I have removed it from crush because it was broken. I have
> > waited 24
> > > hours to see if cephs wou

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
Now the problem is that ceph has put two disks out because scrub has failed
(I think it is not a disk fault but a consequence of the mark-complete).
How can I:
- disable scrub
- put the two disks back in

I will wait anyway for the end of the recovery to be sure it really works again.

On Wed 29 Jun 2016 at 11:16 Mario Giammarco <mgiamma...@gmail.com> wrote:

> Infact I am worried because:
>
> 1) ceph is under proxmox, and proxmox may decide to reboot a server if it
> is not responding
> 2) probably a server was rebooted while ceph was reconstructing
> 3) even using max=3 do not help
>
> Anyway this is the "unofficial" procedure that I am using, much simpler
> than blog post:
>
> 1) find host where is pg
> 2) stop ceph in that host
> 3) ceph-objectstore-tool --pgid 1.98 --op mark-complete --data-path
> /var/lib/ceph/osd/ceph-9 --journal-path /var/lib/ceph/osd/ceph-9/journal
> 4) start ceph
> 5) look finally it reconstructing
>
> On Wed 29 Jun 2016 at 11:11 Oliver Dzombic <i...@ip-interactive.de> wrote:
>
>> Hi,
>>
>> removing ONE disk while your replication is 2, is no problem.
>>
>> You dont need to wait a single second to replace of remove it. Its
>> anyway not used and out/down. So from ceph's point of view its not
>> existent.
>>
>> 
>>
>> But as christian told you already, what we see now fits to a szenario
>> where you lost the osd and eighter you did something, or something else
>> happens, but the data were not recovered again.
>>
>> Eighter because another OSD was broken, or because you did something.
>>
>> Maybe, because of the "too many PGs per OSD (307 > max 300)" ceph never
>> recovered.
>>
>> What i can see from http://pastebin.com/VZD7j2vN is that
>>
>> OSD 5,13,9,0,6,2,3 and maybe others, are the OSD's holding the
>> incomplete data.
>>
>> This are 7 OSD's from 10. So something happend to that OSD's or the data
>> in them. And that had nothing to do with a single disk failing.
>>
>> Something else must have been happend.
>>
>> And as christian already wrote: you will have to go through your logs
>> back until the point were things going down.
>>
>> Because a fail of a single OSD, no matter what your replication size is,
>> can ( normally ) not harm the consistency of 7 other OSD's, means 70% of
>> your total cluster.
>>
>> --
>> Mit freundlichen Gruessen / Best regards
>>
>> Oliver Dzombic
>> IP-Interactive
>>
>> mailto:i...@ip-interactive.de
>>
>> Anschrift:
>>
>> IP Interactive UG ( haftungsbeschraenkt )
>> Zum Sonnenberg 1-3
>> 63571 Gelnhausen
>>
>> HRB 93402 beim Amtsgericht Hanau
>> Geschäftsführung: Oliver Dzombic
>>
>> Steuer Nr.: 35 236 3622 1
>> UST ID: DE274086107
>>
>>
>> On 29.06.2016 at 10:56, Mario Giammarco wrote:
>> > Yes I have removed it from crush because it was broken. I have waited 24
>> > hours to see if cephs would like to heals itself. Then I removed the
>> > disk completely (it was broken...) and I waited 24 hours again. Then I
>> > start getting worried.
>> > Are you saying to me that I should not remove a broken disk from
>> > cluster? 24 hours were not enough?
>> >
>> > On Wed 29 Jun 2016 at 10:53 Zoltan Arnold Nagy <zol...@linux.vnet.ibm.com> wrote:
>> >
>> > Just loosing one disk doesn’t automagically delete it from CRUSH,
>> > but in the output you had 10 disks listed, so there must be
>> > something else going - did you delete the disk from the crush map as
>> > well?
>> >
>> > Ceph waits by default 300 secs AFAIK to mark an OSD out after it
>> > will start to recover.
>> >
>> >
>> >> On 29 Jun 2016, at 10:42, Mario Giammarco <mgiamma...@gmail.com
>> >> <mailto:mgiamma...@gmail.com>> wrote:
>> >>
>> >> I thank you for your reply so I can add my experience:
>> >>
>> >> 1) the other time this thing happened to me I had a cluster with
>> >> min_size=2 and size=3 and the problem was the same. That time I
>> >> put min_size=1 to recover the pool but it did not help. So I do
>> >> not understand where is the advantage to put three copies when
>> >> ceph can decide to discard all three.
>> >> 2) I started with 11 hdds. The hard disk failed. Ceph

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
In fact I am worried because:

1) ceph is under proxmox, and proxmox may decide to reboot a server if it
is not responding
2) probably a server was rebooted while ceph was reconstructing
3) even using max=3 does not help

Anyway this is the "unofficial" procedure that I am using, much simpler
than the blog post:

1) find the host where the pg is
2) stop ceph on that host
3) ceph-objectstore-tool --pgid 1.98 --op mark-complete --data-path
/var/lib/ceph/osd/ceph-9 --journal-path /var/lib/ceph/osd/ceph-9/journal
4) start ceph
5) watch it finally reconstructing

On Wed 29 Jun 2016 at 11:11 Oliver Dzombic <i...@ip-interactive.de> wrote:

> Hi,
>
> removing ONE disk while your replication is 2, is no problem.
>
> You dont need to wait a single second to replace of remove it. Its
> anyway not used and out/down. So from ceph's point of view its not
> existent.
>
> 
>
> But as christian told you already, what we see now fits to a szenario
> where you lost the osd and eighter you did something, or something else
> happens, but the data were not recovered again.
>
> Eighter because another OSD was broken, or because you did something.
>
> Maybe, because of the "too many PGs per OSD (307 > max 300)" ceph never
> recovered.
>
> What i can see from http://pastebin.com/VZD7j2vN is that
>
> OSD 5,13,9,0,6,2,3 and maybe others, are the OSD's holding the
> incomplete data.
>
> This are 7 OSD's from 10. So something happend to that OSD's or the data
> in them. And that had nothing to do with a single disk failing.
>
> Something else must have been happend.
>
> And as christian already wrote: you will have to go through your logs
> back until the point were things going down.
>
> Because a fail of a single OSD, no matter what your replication size is,
> can ( normally ) not harm the consistency of 7 other OSD's, means 70% of
> your total cluster.
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:i...@ip-interactive.de
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
>
>
> On 29.06.2016 at 10:56, Mario Giammarco wrote:
> > Yes I have removed it from crush because it was broken. I have waited 24
> > hours to see if cephs would like to heals itself. Then I removed the
> > disk completely (it was broken...) and I waited 24 hours again. Then I
> > start getting worried.
> > Are you saying to me that I should not remove a broken disk from
> > cluster? 24 hours were not enough?
> >
> > On Wed 29 Jun 2016 at 10:53 Zoltan Arnold Nagy <zol...@linux.vnet.ibm.com> wrote:
> >
> > Just loosing one disk doesn’t automagically delete it from CRUSH,
> > but in the output you had 10 disks listed, so there must be
> > something else going - did you delete the disk from the crush map as
> > well?
> >
> > Ceph waits by default 300 secs AFAIK to mark an OSD out after it
> > will start to recover.
> >
> >
> >> On 29 Jun 2016, at 10:42, Mario Giammarco <mgiamma...@gmail.com
> >> <mailto:mgiamma...@gmail.com>> wrote:
> >>
> >> I thank you for your reply so I can add my experience:
> >>
> >> 1) the other time this thing happened to me I had a cluster with
> >> min_size=2 and size=3 and the problem was the same. That time I
> >> put min_size=1 to recover the pool but it did not help. So I do
> >> not understand where is the advantage to put three copies when
> >> ceph can decide to discard all three.
> >> 2) I started with 11 hdds. The hard disk failed. Ceph waited
> >> forever for hard disk coming back. But hard disk is really
> >> completelly broken so I have followed the procedure to really
> >> delete from cluster. Anyway ceph did not recover.
> >> 3) I have 307 pgs more than 300 but it is due to the fact that I
> >> had 11 hdds now only 10. I will add more hdds after I repair the
> pool
> >> 4) I have reduced the monitors to 3
> >>
> >>
> >>
> >> Il giorno mer 29 giu 2016 alle ore 10:25 Christian Balzer
> >> <ch...@gol.com <mailto:ch...@gol.com>> ha scritto:
> >>
> >>
> >> Hello,
> >>
> >> On Wed, 29 Jun 2016 06:02:59 +0000 Mario Giammarco wrote:

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
Yes, I removed it from crush because it was broken. I waited 24 hours to see
if ceph would heal itself. Then I removed the disk completely (it was
broken...) and I waited 24 hours again. Then I started getting worried.
Are you saying that I should not remove a broken disk from the cluster? Were
24 hours not enough?
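
For completeness, the knob Zoltan mentions and the usual clean-removal
sequence look like this on a hammer-era cluster; a sketch, where osd.11 stands
in for the failed disk and the mon id must be adjusted:

  # how long the monitors wait before marking a down osd "out"
  ceph daemon mon.$(hostname -s) config get mon_osd_down_out_interval

  # clean removal of a dead osd, done only after recovery has finished
  ceph osd out 11
  ceph osd crush remove osd.11
  ceph auth del osd.11
  ceph osd rm 11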

On Wed 29 Jun 2016 at 10:53 Zoltan Arnold Nagy <zol...@linux.vnet.ibm.com> wrote:

> Just losing one disk doesn’t automagically delete it from CRUSH, but in
> the output you had 10 disks listed, so there must be something else going on -
> did you delete the disk from the crush map as well?
>
> Ceph waits by default 300 secs AFAIK to mark an OSD out after it will
> start to recover.
>
>
> On 29 Jun 2016, at 10:42, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> I thank you for your reply so I can add my experience:
>
> 1) the other time this thing happened to me I had a cluster with
> min_size=2 and size=3 and the problem was the same. That time I put
> min_size=1 to recover the pool but it did not help. So I do not understand
> where is the advantage to put three copies when ceph can decide to discard
> all three.
> 2) I started with 11 hdds. The hard disk failed. Ceph waited forever for
> hard disk coming back. But hard disk is really completelly broken so I have
> followed the procedure to really delete from cluster. Anyway ceph did not
> recover.
> 3) I have 307 pgs more than 300 but it is due to the fact that I had 11
> hdds now only 10. I will add more hdds after I repair the pool
> 4) I have reduced the monitors to 3
>
>
>
> On Wed 29 Jun 2016 at 10:25 Christian Balzer <ch...@gol.com> wrote:
>
>>
>> Hello,
>>
>> On Wed, 29 Jun 2016 06:02:59 + Mario Giammarco wrote:
>>
>> > pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>>^
>> And that's the root cause of all your woes.
>> The default replication size is 3 for a reason and while I do run pools
>> with replication of 2 they are either HDD RAIDs or extremely trustworthy
>> and well monitored SSD.
>>
>> That said, something more than a single HDD failure must have happened
>> here, you should check the logs and backtrace all the step you did after
>> that OSD failed.
>>
>> You said there were 11 HDDs and your first ceph -s output showed:
>> ---
>>  osdmap e10182: 10 osds: 10 up, 10 in
>> 
>> And your crush map states the same.
>>
>> So how and WHEN did you remove that OSD?
>> My suspicion would be it was removed before recovery was complete.
>>
>> Also, as I think was mentioned before, 7 mons are overkill 3-5 would be a
>> saner number.
>>
>> Christian
>>
>> > rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool
>> > stripe_width 0
>> >removed_snaps [1~3]
>> > pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>> > rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool
>> > stripe_width 0
>> >removed_snaps [1~3]
>> > pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>> > rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool
>> > stripe_width 0
>> >removed_snaps [1~3]
>> >
>> >
>> > ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
>> > 5 1.81000  1.0  1857G  984G  872G 53.00 0.86
>> > 6 1.81000  1.0  1857G 1202G  655G 64.73 1.05
>> > 2 1.81000  1.0  1857G 1158G  698G 62.38 1.01
>> > 3 1.35999  1.0  1391G  906G  485G 65.12 1.06
>> > 4 0.8  1.0   926G  702G  223G 75.88 1.23
>> > 7 1.81000  1.0  1857G 1063G  793G 57.27 0.93
>> > 8 1.81000  1.0  1857G 1011G  846G 54.44 0.88
>> > 9 0.8  1.0   926G  573G  352G 61.91 1.01
>> > 0 1.81000  1.0  1857G 1227G  629G 66.10 1.07
>> > 13 0.45000  1.0   460G  307G  153G 66.74 1.08
>> >  TOTAL 14846G 9136G 5710G 61.54
>> > MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
>> >
>> >
>> >
>> > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
>> >
>> > http://pastebin.com/SvGfcSHb
>> > http://pastebin.com/gYFatsNS
>> > http://pastebin.com/VZD7j2vN
>> >
>> > I do not understand why I/O on ENTIRE cluster is blocked when only few
>> > pgs are incomplete.
>> >
>> > Many thanks,
>> > Mario
>> >
>> >
>> > Il giorno mar 28 giu 2016 alle ore 19:34 Stefan Priebe - Profiho

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
I thank you for your reply so I can add my experience:

1) the other time this happened to me I had a cluster with min_size=2 and
size=3 and the problem was the same. That time I set min_size=1 to recover
the pool but it did not help (the size/min_size commands are shown after
this list). So I do not understand what the advantage of keeping three
copies is when ceph can decide to discard all three.
2) I started with 11 hdds and one hard disk failed. Ceph waited forever for
the hard disk to come back, but it was really completely broken, so I
followed the procedure to permanently remove it from the cluster. Even so,
ceph did not recover.
3) I have 307 pgs, which is above the 300 warning threshold, but only
because I had 11 hdds and now have 10. I will add more hdds after I repair
the pool.
4) I have reduced the monitors to 3
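
For reference, the pool replication settings can be checked and changed
like this (pool name "rbd" as in my setup; the values are only examples):

  ceph osd pool get rbd size
  ceph osd pool get rbd min_size
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2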



Il giorno mer 29 giu 2016 alle ore 10:25 Christian Balzer <ch...@gol.com>
ha scritto:

>
> Hello,
>
> On Wed, 29 Jun 2016 06:02:59 +0000 Mario Giammarco wrote:
>
> > pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>^
> And that's the root cause of all your woes.
> The default replication size is 3 for a reason and while I do run pools
> with replication of 2 they are either HDD RAIDs or extremely trustworthy
> and well monitored SSD.
>
> That said, something more than a single HDD failure must have happened
> here, you should check the logs and backtrace all the step you did after
> that OSD failed.
>
> You said there were 11 HDDs and your first ceph -s output showed:
> ---
>  osdmap e10182: 10 osds: 10 up, 10 in
> 
> And your crush map states the same.
>
> So how and WHEN did you remove that OSD?
> My suspicion would be it was removed before recovery was complete.
>
> Also, as I think was mentioned before, 7 mons are overkill 3-5 would be a
> saner number.
>
> Christian
>
> > rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool
> > stripe_width 0
> >removed_snaps [1~3]
> > pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> > rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool
> > stripe_width 0
> >removed_snaps [1~3]
> > pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> > rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool
> > stripe_width 0
> >removed_snaps [1~3]
> >
> >
> > ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
> > 5 1.81000  1.0  1857G  984G  872G 53.00 0.86
> > 6 1.81000  1.0  1857G 1202G  655G 64.73 1.05
> > 2 1.81000  1.0  1857G 1158G  698G 62.38 1.01
> > 3 1.35999  1.0  1391G  906G  485G 65.12 1.06
> > 4 0.8  1.0   926G  702G  223G 75.88 1.23
> > 7 1.81000  1.0  1857G 1063G  793G 57.27 0.93
> > 8 1.81000  1.0  1857G 1011G  846G 54.44 0.88
> > 9 0.8  1.0   926G  573G  352G 61.91 1.01
> > 0 1.81000  1.0  1857G 1227G  629G 66.10 1.07
> > 13 0.45000  1.0   460G  307G  153G 66.74 1.08
> >  TOTAL 14846G 9136G 5710G 61.54
> > MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
> >
> >
> >
> > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
> >
> > http://pastebin.com/SvGfcSHb
> > http://pastebin.com/gYFatsNS
> > http://pastebin.com/VZD7j2vN
> >
> > I do not understand why I/O on ENTIRE cluster is blocked when only few
> > pgs are incomplete.
> >
> > Many thanks,
> > Mario
> >
> >
> > Il giorno mar 28 giu 2016 alle ore 19:34 Stefan Priebe - Profihost AG <
> > s.pri...@profihost.ag> ha scritto:
> >
> > > And ceph health detail
> > >
> > > Stefan
> > >
> > > Excuse my typo sent from my mobile phone.
> > >
> > > Am 28.06.2016 um 19:28 schrieb Oliver Dzombic <i...@ip-interactive.de
> >:
> > >
> > > Hi Mario,
> > >
> > > please give some more details:
> > >
> > > Please the output of:
> > >
> > > ceph osd pool ls detail
> > > ceph osd df
> > > ceph --version
> > >
> > > ceph -w for 10 seconds ( use http://pastebin.com/ please )
> > >
> > > ceph osd crush dump ( also pastebin pls )
> > >
> > > --
> > > Mit freundlichen Gruessen / Best regards
> > >
> > > Oliver Dzombic
> > > IP-Interactive
> > >
> > > mailto:i...@ip-interactive.de <i...@ip-interactive.de>
> > >
> > > Anschrift:
> > >
> > > IP Interactive UG ( haftungsbeschraenkt )
> > > Zum Sonnenberg 1-3
> > > 63571 Gelnhausen
> > >
> > > HRB 934

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
I have searched google and I see that there is no official procedure.
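
From the blog post and the hints in the quoted reply below, the unofficial
route would look roughly like this (osd 0, pg 0.40 and the paths are only
placeholders, and mark-complete irreversibly gives up the missing writes):

  service ceph stop osd.0          # the osd holding the incomplete pg
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
      --journal-path /var/lib/ceph/osd/ceph-0/journal \
      --pgid 0.40 --op export --file /root/pg-0.40.export   # safety copy
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
      --journal-path /var/lib/ceph/osd/ceph-0/journal \
      --pgid 0.40 --op mark-complete
  service ceph start osd.0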

Il giorno mer 29 giu 2016 alle ore 09:43 Mario Giammarco <
mgiamma...@gmail.com> ha scritto:

> I have read many times the post "incomplete pgs, oh my"
> I think my case is different.
> The broken disk is completely broken.
> So how can I simply mark incomplete pgs as complete?
> Should I stop ceph before?
>
>
> Il giorno mer 29 giu 2016 alle ore 09:36 Tomasz Kuzemko <
> tomasz.kuze...@corp.ovh.com> ha scritto:
>
>> Hi,
>> if you need fast access to your remaining data you can use
>> ceph-objectstore-tool to mark those PGs as complete, however this will
>> irreversibly lose the missing data.
>>
>> If you understand the risks, this procedure is pretty good explained here:
>> http://ceph.com/community/incomplete-pgs-oh-my/
>>
>> Since this article was written, ceph-objectstore-tool gained a feature
>> that was not available at that time, that is "--op mark-complete". I
>> think it will be necessary in your case to call --op mark-complete after
>> you import the PG to temporary OSD (between steps 12 and 13).
>>
>> On 29.06.2016 09:09, Mario Giammarco wrote:
>> > Now I have also discovered that, by mistake, someone has put production
>> > data on a virtual machine of the cluster. I need that ceph starts I/O so
>> > I can boot that virtual machine.
>> > Can I mark the incomplete pgs as valid?
>> > If needed, where can I buy some paid support?
>> > Thanks again,
>> > Mario
>> >
>> > Il giorno mer 29 giu 2016 alle ore 08:02 Mario Giammarco
>> > <mgiamma...@gmail.com <mailto:mgiamma...@gmail.com>> ha scritto:
>> >
>> > pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0
>> > object_hash rjenkins pg_num 512 pgp_num 512 last_change 9313 flags
>> > hashpspool stripe_width 0
>> >removed_snaps [1~3]
>> > pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0
>> > object_hash rjenkins pg_num 512 pgp_num 512 last_change 9314 flags
>> > hashpspool stripe_width 0
>> >removed_snaps [1~3]
>> > pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0
>> > object_hash rjenkins pg_num 512 pgp_num 512 last_change 10537 flags
>> > hashpspool stripe_width 0
>> >removed_snaps [1~3]
>> >
>> >
>> > ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
>> > 5 1.81000  1.0  1857G  984G  872G 53.00 0.86
>> > 6 1.81000  1.0  1857G 1202G  655G 64.73 1.05
>> > 2 1.81000  1.0  1857G 1158G  698G 62.38 1.01
>> > 3 1.35999  1.0  1391G  906G  485G 65.12 1.06
>> > 4 0.8  1.0   926G  702G  223G 75.88 1.23
>> > 7 1.81000  1.0  1857G 1063G  793G 57.27 0.93
>> > 8 1.81000  1.0  1857G 1011G  846G 54.44 0.88
>> > 9 0.8  1.0   926G  573G  352G 61.91 1.01
>> > 0 1.81000  1.0  1857G 1227G  629G 66.10 1.07
>> > 13 0.45000  1.0   460G  307G  153G 66.74 1.08
>> >  TOTAL 14846G 9136G 5710G 61.54
>> > MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
>> >
>> >
>> >
>> > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
>> >
>> > http://pastebin.com/SvGfcSHb
>> > http://pastebin.com/gYFatsNS
>> > http://pastebin.com/VZD7j2vN
>> >
>> > I do not understand why I/O on ENTIRE cluster is blocked when only
>> > few pgs are incomplete.
>> >
>> > Many thanks,
>> > Mario
>> >
>> >
>> > Il giorno mar 28 giu 2016 alle ore 19:34 Stefan Priebe - Profihost
>> > AG <s.pri...@profihost.ag <mailto:s.pri...@profihost.ag>> ha
>> scritto:
>> >
>> > And ceph health detail
>> >
>> > Stefan
>> >
>> > Excuse my typo sent from my mobile phone.
>> >
>> > Am 28.06.2016 um 19:28 schrieb Oliver Dzombic
>> > <i...@ip-interactive.de <mailto:i...@ip-interactive.de>>:
>> >
>> >> Hi Mario,
>> >>
>> >> please give some more details:
>> >>
>> >> Please the output of:
>> >>
>> >> ceph osd pool ls detail
>> >> ceph osd df
>> >> ceph --version
>> >>
>> >> ceph -w for 10 seconds ( use http://pastebin.com/

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
Now I have also discovered that, by mistake, someone has put production
data on a virtual machine of the cluster. I need ceph to resume I/O so I
can boot that virtual machine.
Can I mark the incomplete pgs as valid?
If needed, where can I buy some paid support?
Thanks again,
Mario

Il giorno mer 29 giu 2016 alle ore 08:02 Mario Giammarco <
mgiamma...@gmail.com> ha scritto:

> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool
> stripe_width 0
>removed_snaps [1~3]
> pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool
> stripe_width 0
>removed_snaps [1~3]
> pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool
> stripe_width 0
>removed_snaps [1~3]
>
>
> ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
> 5 1.81000  1.0  1857G  984G  872G 53.00 0.86
> 6 1.81000  1.0  1857G 1202G  655G 64.73 1.05
> 2 1.81000  1.0  1857G 1158G  698G 62.38 1.01
> 3 1.35999  1.0  1391G  906G  485G 65.12 1.06
> 4 0.8  1.0   926G  702G  223G 75.88 1.23
> 7 1.81000  1.0  1857G 1063G  793G 57.27 0.93
> 8 1.81000  1.0  1857G 1011G  846G 54.44 0.88
> 9 0.8  1.0   926G  573G  352G 61.91 1.01
> 0 1.81000  1.0  1857G 1227G  629G 66.10 1.07
> 13 0.45000  1.0   460G  307G  153G 66.74 1.08
>  TOTAL 14846G 9136G 5710G 61.54
> MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
>
>
>
> ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
>
> http://pastebin.com/SvGfcSHb
> http://pastebin.com/gYFatsNS
> http://pastebin.com/VZD7j2vN
>
> I do not understand why I/O on ENTIRE cluster is blocked when only few pgs
> are incomplete.
>
> Many thanks,
> Mario
>
>
> Il giorno mar 28 giu 2016 alle ore 19:34 Stefan Priebe - Profihost AG <
> s.pri...@profihost.ag> ha scritto:
>
>> And ceph health detail
>>
>> Stefan
>>
>> Excuse my typo sent from my mobile phone.
>>
>> Am 28.06.2016 um 19:28 schrieb Oliver Dzombic <i...@ip-interactive.de>:
>>
>> Hi Mario,
>>
>> please give some more details:
>>
>> Please the output of:
>>
>> ceph osd pool ls detail
>> ceph osd df
>> ceph --version
>>
>> ceph -w for 10 seconds ( use http://pastebin.com/ please )
>>
>> ceph osd crush dump ( also pastebin pls )
>>
>> --
>> Mit freundlichen Gruessen / Best regards
>>
>> Oliver Dzombic
>> IP-Interactive
>>
>> mailto:i...@ip-interactive.de <i...@ip-interactive.de>
>>
>> Anschrift:
>>
>> IP Interactive UG ( haftungsbeschraenkt )
>> Zum Sonnenberg 1-3
>> 63571 Gelnhausen
>>
>> HRB 93402 beim Amtsgericht Hanau
>> Geschäftsführung: Oliver Dzombic
>>
>> Steuer Nr.: 35 236 3622 1
>> UST ID: DE274086107
>>
>>
>> Am 28.06.2016 um 18:59 schrieb Mario Giammarco:
>>
>> Hello,
>>
>> this is the second time that happens to me, I hope that someone can
>>
>> explain what I can do.
>>
>> Proxmox ceph cluster with 8 servers, 11 hdd. Min_size=1, size=2.
>>
>>
>> One hdd goes down due to bad sectors.
>>
>> Ceph recovers but it ends with:
>>
>>
>> cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
>>
>> health HEALTH_WARN
>>
>>3 pgs down
>>
>>19 pgs incomplete
>>
>>19 pgs stuck inactive
>>
>>19 pgs stuck unclean
>>
>>7 requests are blocked > 32 sec
>>
>> monmap e11: 7 mons at
>>
>> {0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,
>>
>> 2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,4=192.168.0.202:
>>
>> 6789/0,5=192.168.0.206:6789/0,6=192.168.0.207:6789/0}
>>
>>election epoch 722, quorum
>>
>> 0,1,2,3,4,5,6 1,4,2,0,3,5,6
>>
>> osdmap e10182: 10 osds: 10 up, 10 in
>>
>>  pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
>>
>>9136 GB used, 5710 GB / 14846 GB avail
>>
>>1005 active+clean
>>
>>  16 incomplete
>>
>>   3 down+incomplete
>>
>>
>> Unfortunately "7 requests blocked" means no virtual machine can boot
>>
>> because ceph has stopped i/o.
>>
>>
>> I can accept to lose some data, but not ALL data!
>>
>> Can you help me please?
>>
>> Thanks,
>>
>> Mario
>>
>>
>> ___
>>
>> ceph-users mailing list
>>
>> ceph-users@lists.ceph.com
>>
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Mario Giammarco
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool
stripe_width 0
   removed_snaps [1~3]
pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool
stripe_width 0
   removed_snaps [1~3]
pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool
stripe_width 0
   removed_snaps [1~3]


ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
5 1.81000  1.0  1857G  984G  872G 53.00 0.86
6 1.81000  1.0  1857G 1202G  655G 64.73 1.05
2 1.81000  1.0  1857G 1158G  698G 62.38 1.01
3 1.35999  1.0  1391G  906G  485G 65.12 1.06
4 0.8  1.0   926G  702G  223G 75.88 1.23
7 1.81000  1.0  1857G 1063G  793G 57.27 0.93
8 1.81000  1.0  1857G 1011G  846G 54.44 0.88
9 0.8  1.0   926G  573G  352G 61.91 1.01
0 1.81000  1.0  1857G 1227G  629G 66.10 1.07
13 0.45000  1.0   460G  307G  153G 66.74 1.08
 TOTAL 14846G 9136G 5710G 61.54
MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47



ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)

http://pastebin.com/SvGfcSHb
http://pastebin.com/gYFatsNS
http://pastebin.com/VZD7j2vN

I do not understand why I/O on the ENTIRE cluster is blocked when only a
few pgs are incomplete.
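
To see exactly which requests are stuck and on which pgs, these are the
commands I know of (the admin-socket ones must run on the host of the
affected osd; osd.0 and pg 0.40 are only examples):

  ceph health detail
  ceph pg dump_stuck inactive
  ceph pg dump_stuck unclean
  ceph pg 0.40 query                      # see the recovery_state section
  ceph daemon osd.0 dump_ops_in_flight
  ceph daemon osd.0 dump_historic_ops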

Many thanks,
Mario


Il giorno mar 28 giu 2016 alle ore 19:34 Stefan Priebe - Profihost AG <
s.pri...@profihost.ag> ha scritto:

> And ceph health detail
>
> Stefan
>
> Excuse my typo sent from my mobile phone.
>
> Am 28.06.2016 um 19:28 schrieb Oliver Dzombic <i...@ip-interactive.de>:
>
> Hi Mario,
>
> please give some more details:
>
> Please the output of:
>
> ceph osd pool ls detail
> ceph osd df
> ceph --version
>
> ceph -w for 10 seconds ( use http://pastebin.com/ please )
>
> ceph osd crush dump ( also pastebin pls )
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:i...@ip-interactive.de <i...@ip-interactive.de>
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
>
>
> Am 28.06.2016 um 18:59 schrieb Mario Giammarco:
>
> Hello,
>
> this is the second time that happens to me, I hope that someone can
>
> explain what I can do.
>
> Proxmox ceph cluster with 8 servers, 11 hdd. Min_size=1, size=2.
>
>
> One hdd goes down due to bad sectors.
>
> Ceph recovers but it ends with:
>
>
> cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
>
> health HEALTH_WARN
>
>3 pgs down
>
>19 pgs incomplete
>
>19 pgs stuck inactive
>
>19 pgs stuck unclean
>
>7 requests are blocked > 32 sec
>
> monmap e11: 7 mons at
>
> {0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,
>
> 2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,4=192.168.0.202:
>
> 6789/0,5=192.168.0.206:6789/0,6=192.168.0.207:6789/0}
>
>election epoch 722, quorum
>
> 0,1,2,3,4,5,6 1,4,2,0,3,5,6
>
> osdmap e10182: 10 osds: 10 up, 10 in
>
>  pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
>
>9136 GB used, 5710 GB / 14846 GB avail
>
>1005 active+clean
>
>  16 incomplete
>
>   3 down+incomplete
>
>
> Unfortunately "7 requests blocked" means no virtual machine can boot
>
> because ceph has stopped i/o.
>
>
> I can accept to lose some data, but not ALL data!
>
> Can you help me please?
>
> Thanks,
>
> Mario
>
>
> ___
>
> ceph-users mailing list
>
> ceph-users@lists.ceph.com
>
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Another cluster completely hang

2016-06-28 Thread Mario Giammarco
Hello,
this is the second time this has happened to me; I hope that someone can
explain what I can do.
Proxmox ceph cluster with 8 servers, 11 hdd. Min_size=1, size=2.

One hdd went down due to bad sectors.
Ceph recovered, but it ended up with:

cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
 health HEALTH_WARN
3 pgs down
19 pgs incomplete
19 pgs stuck inactive
19 pgs stuck unclean
7 requests are blocked > 32 sec
 monmap e11: 7 mons at
{0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,
2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,4=192.168.0.202:
6789/0,5=192.168.0.206:6789/0,6=192.168.0.207:6789/0}
election epoch 722, quorum 
0,1,2,3,4,5,6 1,4,2,0,3,5,6
 osdmap e10182: 10 osds: 10 up, 10 in
  pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
9136 GB used, 5710 GB / 14846 GB avail
1005 active+clean
  16 incomplete
   3 down+incomplete

Unfortunately "7 requests blocked" means no virtual machine can boot 
because ceph has stopped i/o.

I can accept losing some data, but not ALL data!
Can you help me please?
Thanks,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Help: pool not responding] Now osd crash

2016-03-08 Thread Mario Giammarco
h::buffer::list*)+0x4ab) [0x7c616b]
11: (OSD::load_pgs()+0xa20) [0x6a9170]
12: (OSD::init()+0xc84) [0x6ac204]
13: (main()+0x2839) [0x632459]
14: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
15: /usr/bin/ceph-osd() [0x64c087]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.


2016-03-02 9:38 GMT+01:00 Mario Giammarco <mgiamma...@gmail.com>:

> Here it is:
>
>  cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
>  health HEALTH_WARN
> 4 pgs incomplete
> 4 pgs stuck inactive
> 4 pgs stuck unclean
> 1 requests are blocked > 32 sec
>  monmap e8: 3 mons at {0=
> 10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
> election epoch 840, quorum 0,1,2 0,1,2
>  osdmap e2405: 3 osds: 3 up, 3 in
>   pgmap v5904430: 288 pgs, 4 pools, 391 GB data, 100 kobjects
> 1090 GB used, 4481 GB / 5571 GB avail
>  284 active+clean
>4 incomplete
>   client io 4008 B/s rd, 446 kB/s wr, 23 op/s
>
>
> 2016-03-02 9:31 GMT+01:00 Shinobu Kinjo <ski...@redhat.com>:
>
>> Is "ceph -s" still showing you same output?
>>
>> > cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
>> >  health HEALTH_WARN
>> > 4 pgs incomplete
>> > 4 pgs stuck inactive
>> > 4 pgs stuck unclean
>> >  monmap e8: 3 mons at
>> > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
>> > election epoch 832, quorum 0,1,2 0,1,2
>> >  osdmap e2400: 3 osds: 3 up, 3 in
>> >   pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100 kobjects
>> > 1090 GB used, 4481 GB / 5571 GB avail
>> >  284 active+clean
>> >4 incomplete
>>
>> Cheers,
>> S
>>
>> - Original Message -
>> From: "Mario Giammarco" <mgiamma...@gmail.com>
>> To: "Lionel Bouton" <lionel-subscript...@bouton.name>
>> Cc: "Shinobu Kinjo" <ski...@redhat.com>, ceph-users@lists.ceph.com
>> Sent: Wednesday, March 2, 2016 4:27:15 PM
>> Subject: Re: [ceph-users] Help: pool not responding
>>
>> Tried to set min_size=1 but unfortunately nothing has changed.
>> Thanks for the idea.
>>
>> 2016-02-29 22:56 GMT+01:00 Lionel Bouton <lionel-subscript...@bouton.name
>> >:
>>
>> > Le 29/02/2016 22:50, Shinobu Kinjo a écrit :
>> >
>> > the fact that they are optimized for benchmarks and certainly not
>> > Ceph OSD usage patterns (with or without internal journal).
>> >
>> > Are you assuming that SSHD is causing the issue?
>> > If you could elaborate on this more, it would be helpful.
>> >
>> >
>> > Probably not (unless they reveal themselves extremely unreliable with
>> Ceph
>> > OSD usage patterns which would be surprising to me).
>> >
>> > For incomplete PG the documentation seems good enough for what should be
>> > done :
>> > http://docs.ceph.com/docs/master/rados/operations/pg-states/
>> >
>> > The relevant text:
>> >
>> > *Incomplete* Ceph detects that a placement group is missing information
>> > about writes that may have occurred, or does not have any healthy
>> copies.
>> > If you see this state, try to start any failed OSDs that may contain the
>> > needed information or temporarily adjust min_size to allow recovery.
>> >
>> > We don't have the full history but the most probable cause of these
>> > incomplete PGs is that min_size is set to 2 or 3 and at some time the 4
>> > incomplete pgs didn't have as many replica as the min_size value. So if
>> > setting min_size to 2 isn't enough setting it to 1 should unfreeze them.
>> >
>> > Lionel
>> >
>>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-03-05 Thread Mario Giammarco
I have tried every way I could think of to recover the pool (taking osds
out, scrubbing, etc.).
If there is no way to reset those four pgs, or to understand why they are
not repairing themselves, I will destroy the pool.
But destroying an entire pool only to unblock 4 incomplete pgs is
incredible.
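
For reference, the per-pg scrub and repair attempts look like this (0.40 is
one of the incomplete pgs from "ceph health detail"); as far as I can tell
they do nothing while the pg is incomplete, because scrubs only run on
active pgs:

  ceph pg scrub 0.40
  ceph pg deep-scrub 0.40
  ceph pg repair 0.40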

Mario

2016-03-03 21:51 GMT+01:00 Dimitar Boichev <dimitar.boic...@axsmarine.com>:

> But the whole cluster or what ?
>
> Regards.
>
> *Dimitar Boichev*
> SysAdmin Team Lead
> AXSMarine Sofia
> Phone: +359 889 22 55 42
> Skype: dimitar.boichev.axsmarine
> E-mail: dimitar.boic...@axsmarine.com
>
> On Mar 3, 2016, at 22:47, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> Uses init script to restart
>
> *Da: *Dimitar Boichev
> *Inviato: *giovedì 3 marzo 2016 21:44
> *A: *Mario Giammarco
> *Cc: *Oliver Dzombic; ceph-users@lists.ceph.com
> *Oggetto: *Re: [ceph-users] Help: pool not responding
>
> I see a lot of people (including myself) ending with PGs that are stuck in
> “creating” state when you force create them.
>
> How did you restart ceph ?
> Mine were created fine after I restarted the monitor nodes after a minor
> version upgrade.
> Did you do it monitors firs, osds second, etc etc …..
>
> Regards.
>
>
> On Mar 3, 2016, at 13:13, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> I have tried "force create". It says "creating" but at the end problem
> persists.
> I have restarted ceph as usual.
> I am evaluating ceph and I am shocked because it semeed a very robust
> filesystem and now for a glitch I have an entire pool blocked and there is
> no simple procedure to force a recovery.
>
> 2016-03-02 18:31 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de>:
>
>> Hi,
>>
>> i could also not find any delete, but a create.
>>
>> I found this here, its basically your situation:
>>
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/032412.html
>>
>> --
>> Mit freundlichen Gruessen / Best regards
>>
>> Oliver Dzombic
>> IP-Interactive
>>
>> mailto:i...@ip-interactive.de
>>
>> Anschrift:
>>
>> IP Interactive UG ( haftungsbeschraenkt )
>> Zum Sonnenberg 1-3
>> 63571 Gelnhausen
>>
>> HRB 93402 beim Amtsgericht Hanau
>> Geschäftsführung: Oliver Dzombic
>>
>> Steuer Nr.: 35 236 3622 1
>> UST ID: DE274086107
>>
>>
>> Am 02.03.2016 um 18:28 schrieb Mario Giammarco:
>> > Thans for info even if it is a bad info.
>> > Anyway I am reading docs again and I do not see a way to delete PGs.
>> > How can I remove them?
>> > Thanks,
>> > Mario
>> >
>> > 2016-03-02 17:59 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de
>> > <mailto:i...@ip-interactive.de>>:
>> >
>> > Hi,
>> >
>> > as i see your situation, somehow this 4 pg's got lost.
>> >
>> > They will not recover, because they are incomplete. So there is no
>> data
>> > from which it could be recovered.
>> >
>> > So all what is left is to delete this pg's.
>> >
>> > Since all 3 osd's are in and up, it does not seem like you can
>> somehow
>> > access this lost pg's.
>> >
>> > --
>> > Mit freundlichen Gruessen / Best regards
>> >
>> > Oliver Dzombic
>> > IP-Interactive
>> >
>> > mailto:i...@ip-interactive.de <mailto:i...@ip-interactive.de>
>> >
>> > Anschrift:
>> >
>> > IP Interactive UG ( haftungsbeschraenkt )
>> > Zum Sonnenberg 1-3
>> > 63571 Gelnhausen
>> >
>> > HRB 93402 beim Amtsgericht Hanau
>> > Geschäftsführung: Oliver Dzombic
>> >
>> > Steuer Nr.: 35 236 3622 1 <tel:35%20236%203622%201>
>> > UST ID: DE274086107
>> >
>> >
>> > Am 02.03.2016  um 17:45 schrieb Mario Giammarco:
>> > >
>> > >
>> > > Here it is:
>> > >
>> > >  cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
>> > >  health HEALTH_WARN
>> > > 4 pgs incomplete
>> > > 4 pgs stuck inactive
>> > > 4 pgs stuck unclean
>> > > 1 requests are blocked > 32 sec
>> > >  monmap e8: 3 mons at
>> > > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0
>> > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>
&

Re: [ceph-users] Help: pool not responding

2016-03-04 Thread Mario Giammarco
I have restarted each host using init scripts. Is there another way?
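
I suppose single daemons can also be restarted with the same init script,
something like the lines below, but so far I have always restarted whole
hosts:

  service ceph restart mon.0      # on the monitor host
  service ceph restart osd.3      # on the osd host
  ceph -s                         # check the cluster after each restart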

2016-03-03 21:51 GMT+01:00 Dimitar Boichev <dimitar.boic...@axsmarine.com>:

> But the whole cluster or what ?
>
> Regards.
>
> *Dimitar Boichev*
> SysAdmin Team Lead
> AXSMarine Sofia
> Phone: +359 889 22 55 42
> Skype: dimitar.boichev.axsmarine
> E-mail: dimitar.boic...@axsmarine.com
>
> On Mar 3, 2016, at 22:47, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> Uses init script to restart
>
> *Da: *Dimitar Boichev
> *Inviato: *giovedì 3 marzo 2016 21:44
> *A: *Mario Giammarco
> *Cc: *Oliver Dzombic; ceph-users@lists.ceph.com
> *Oggetto: *Re: [ceph-users] Help: pool not responding
>
> I see a lot of people (including myself) ending with PGs that are stuck in
> “creating” state when you force create them.
>
> How did you restart ceph ?
> Mine were created fine after I restarted the monitor nodes after a minor
> version upgrade.
> Did you do it monitors firs, osds second, etc etc …..
>
> Regards.
>
>
> On Mar 3, 2016, at 13:13, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> I have tried "force create". It says "creating" but at the end problem
> persists.
> I have restarted ceph as usual.
> I am evaluating ceph and I am shocked because it semeed a very robust
> filesystem and now for a glitch I have an entire pool blocked and there is
> no simple procedure to force a recovery.
>
> 2016-03-02 18:31 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de>:
>
>> Hi,
>>
>> i could also not find any delete, but a create.
>>
>> I found this here, its basically your situation:
>>
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/032412.html
>>
>> --
>> Mit freundlichen Gruessen / Best regards
>>
>> Oliver Dzombic
>> IP-Interactive
>>
>> mailto:i...@ip-interactive.de
>>
>> Anschrift:
>>
>> IP Interactive UG ( haftungsbeschraenkt )
>> Zum Sonnenberg 1-3
>> 63571 Gelnhausen
>>
>> HRB 93402 beim Amtsgericht Hanau
>> Geschäftsführung: Oliver Dzombic
>>
>> Steuer Nr.: 35 236 3622 1
>> UST ID: DE274086107
>>
>>
>> Am 02.03.2016 um 18:28 schrieb Mario Giammarco:
>> > Thans for info even if it is a bad info.
>> > Anyway I am reading docs again and I do not see a way to delete PGs.
>> > How can I remove them?
>> > Thanks,
>> > Mario
>> >
>> > 2016-03-02 17:59 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de
>> > <mailto:i...@ip-interactive.de>>:
>> >
>> > Hi,
>> >
>> > as i see your situation, somehow this 4 pg's got lost.
>> >
>> > They will not recover, because they are incomplete. So there is no
>> data
>> > from which it could be recovered.
>> >
>> > So all what is left is to delete this pg's.
>> >
>> > Since all 3 osd's are in and up, it does not seem like you can
>> somehow
>> > access this lost pg's.
>> >
>> > --
>> > Mit freundlichen Gruessen / Best regards
>> >
>> > Oliver Dzombic
>> > IP-Interactive
>> >
>> > mailto:i...@ip-interactive.de <mailto:i...@ip-interactive.de>
>> >
>> > Anschrift:
>> >
>> > IP Interactive UG ( haftungsbeschraenkt )
>> > Zum Sonnenberg 1-3
>> > 63571 Gelnhausen
>> >
>> > HRB 93402 beim Amtsgericht Hanau
>> > Geschäftsführung: Oliver Dzombic
>> >
>> > Steuer Nr.: 35 236 3622 1 <tel:35%20236%203622%201>
>> > UST ID: DE274086107
>> >
>> >
>> > Am 02.03.2016  um 17:45 schrieb Mario Giammarco:
>> > >
>> > >
>> > > Here it is:
>> > >
>> > >  cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
>> > >  health HEALTH_WARN
>> > > 4 pgs incomplete
>> > > 4 pgs stuck inactive
>> > > 4 pgs stuck unclean
>> > > 1 requests are blocked > 32 sec
>> > >  monmap e8: 3 mons at
>> > > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0
>> > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>
>> > > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>}
>> > > election epoch 840, quorum 0,1,2 0,1,2
>> > >  osdmap e2405: 3 o

Re: [ceph-users] Fwd: Help: pool not responding

2016-03-03 Thread Mario Giammarco
I have tried "force create". It says "creating" but at the end problem
persists.
I have restarted ceph as usual.
I am evaluating ceph and I am shocked because it semeed a very robust
filesystem and now for a glitch I have an entire pool blocked and there is
no simple procedure to force a recovery.
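
The only extra checks I know of for a pg stuck in "creating" are these
(0.0 is one of the stuck pgs as an example):

  ceph pg map 0.0              # which osds the pg maps to now
  ceph pg 0.0 query            # recovery_state shows what it is waiting for
  ceph pg dump_stuck inactive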

2016-03-02 18:31 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de>:

> Hi,
>
> i could also not find any delete, but a create.
>
> I found this here, its basically your situation:
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/032412.html
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:i...@ip-interactive.de
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
>
>
> Am 02.03.2016 um 18:28 schrieb Mario Giammarco:
> > Thans for info even if it is a bad info.
> > Anyway I am reading docs again and I do not see a way to delete PGs.
> > How can I remove them?
> > Thanks,
> > Mario
> >
> > 2016-03-02 17:59 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de
> > <mailto:i...@ip-interactive.de>>:
> >
> > Hi,
> >
> > as i see your situation, somehow this 4 pg's got lost.
> >
> > They will not recover, because they are incomplete. So there is no
> data
> > from which it could be recovered.
> >
> > So all what is left is to delete this pg's.
> >
> > Since all 3 osd's are in and up, it does not seem like you can
> somehow
> > access this lost pg's.
> >
> > --
> > Mit freundlichen Gruessen / Best regards
> >
> > Oliver Dzombic
> > IP-Interactive
> >
> > mailto:i...@ip-interactive.de <mailto:i...@ip-interactive.de>
> >
> > Anschrift:
> >
> > IP Interactive UG ( haftungsbeschraenkt )
> > Zum Sonnenberg 1-3
> > 63571 Gelnhausen
> >
> > HRB 93402 beim Amtsgericht Hanau
> > Geschäftsführung: Oliver Dzombic
> >
> > Steuer Nr.: 35 236 3622 1 <tel:35%20236%203622%201>
> > UST ID: DE274086107
> >
> >
> > Am 02.03.2016  um 17:45 schrieb Mario Giammarco:
> > >
> > >
> > > Here it is:
> > >
> > >  cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
> > >  health HEALTH_WARN
> > > 4 pgs incomplete
> > > 4 pgs stuck inactive
> > > 4 pgs stuck unclean
> > > 1 requests are blocked > 32 sec
> > >  monmap e8: 3 mons at
> > > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0
> > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>
> > > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>}
> > > election epoch 840, quorum 0,1,2 0,1,2
> > >  osdmap e2405: 3 osds: 3 up, 3 in
> > >   pgmap v5904430: 288 pgs, 4 pools, 391 GB data, 100 kobjects
> > > 1090 GB used, 4481 GB / 5571 GB avail
> > >  284 active+clean
> > >4 incomplete
> > >   client io 4008 B/s rd, 446 kB/s wr, 23 op/s
> > >
> > >
> > > 2016-03-02 9:31 GMT+01:00 Shinobu Kinjo <ski...@redhat.com
> > <mailto:ski...@redhat.com>
> > > <mailto:ski...@redhat.com <mailto:ski...@redhat.com>>>:
> > >
> > > Is "ceph -s" still showing you same output?
> > >
> > > > cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
> > > >  health HEALTH_WARN
> > > > 4 pgs incomplete
> > > > 4 pgs stuck inactive
> > > > 4 pgs stuck unclean
> > > >  monmap e8: 3 mons at
> > > > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0
> > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>
> > > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0
> >}
> > > > election epoch 832, quorum 0,1,2 0,1,2
> > > >  osdmap e2400: 3 osds: 3 up, 3 in
> > > >   pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100
> > kobjects

Re: [ceph-users] Fwd: Help: pool not responding

2016-03-02 Thread Mario Giammarco
Thanks for the info, even if it is bad news.
Anyway, I am reading the docs again and I do not see a way to delete PGs.
How can I remove them?
Thanks,
Mario

2016-03-02 17:59 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de>:

> Hi,
>
> as i see your situation, somehow this 4 pg's got lost.
>
> They will not recover, because they are incomplete. So there is no data
> from which it could be recovered.
>
> So all what is left is to delete this pg's.
>
> Since all 3 osd's are in and up, it does not seem like you can somehow
> access this lost pg's.
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:i...@ip-interactive.de
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
>
>
> Am 02.03.2016 um 17:45 schrieb Mario Giammarco:
> >
> >
> > Here it is:
> >
> >  cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
> >  health HEALTH_WARN
> > 4 pgs incomplete
> > 4 pgs stuck inactive
> > 4 pgs stuck unclean
> > 1 requests are blocked > 32 sec
> >  monmap e8: 3 mons at
> > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0
> > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>}
> > election epoch 840, quorum 0,1,2 0,1,2
> >  osdmap e2405: 3 osds: 3 up, 3 in
> >   pgmap v5904430: 288 pgs, 4 pools, 391 GB data, 100 kobjects
> > 1090 GB used, 4481 GB / 5571 GB avail
> >  284 active+clean
> >4 incomplete
> >   client io 4008 B/s rd, 446 kB/s wr, 23 op/s
> >
> >
> > 2016-03-02 9:31 GMT+01:00 Shinobu Kinjo <ski...@redhat.com
> > <mailto:ski...@redhat.com>>:
> >
> > Is "ceph -s" still showing you same output?
> >
> > > cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
> > >  health HEALTH_WARN
> > > 4 pgs incomplete
> > > 4 pgs stuck inactive
> > > 4 pgs stuck unclean
> > >  monmap e8: 3 mons at
> > > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0
> > <http://10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0>}
> > > election epoch 832, quorum 0,1,2 0,1,2
> > >  osdmap e2400: 3 osds: 3 up, 3 in
> > >   pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100 kobjects
> > > 1090 GB used, 4481 GB / 5571 GB avail
> > >  284 active+clean
> > >4 incomplete
> >
> > Cheers,
> > S
> >
> > - Original Message -
> > From: "Mario Giammarco" <mgiamma...@gmail.com
> > <mailto:mgiamma...@gmail.com>>
> > To: "Lionel Bouton" <lionel-subscript...@bouton.name
> > <mailto:lionel-subscript...@bouton.name>>
> > Cc: "Shinobu Kinjo" <ski...@redhat.com <mailto:ski...@redhat.com>>,
> > ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> > Sent: Wednesday, March 2, 2016 4:27:15 PM
> > Subject: Re: [ceph-users] Help: pool not responding
> >
> > Tried to set min_size=1 but unfortunately nothing has changed.
> > Thanks for the idea.
> >
> > 2016-02-29 22:56 GMT+01:00 Lionel Bouton
> > <lionel-subscript...@bouton.name
> > <mailto:lionel-subscript...@bouton.name>>:
> >
> > > Le 29/02/2016 22:50, Shinobu Kinjo a écrit :
> > >
> > > the fact that they are optimized for benchmarks and certainly not
> > > Ceph OSD usage patterns (with or without internal journal).
> > >
> > > Are you assuming that SSHD is causing the issue?
> > > If you could elaborate on this more, it would be helpful.
> > >
> > >
> > > Probably not (unless they reveal themselves extremely unreliable
> > with Ceph
> > > OSD usage patterns which would be surprising to me).
> > >
> > > For incomplete PG the documentation seems good enough for what
> > should be
> > > done :
> > > http://docs.ceph.com/docs/master/rados/operations/pg-states/
> > >
> > > The relevant text:
> > >
> > > *Incomplete* Cep

[ceph-users] Fwd: Help: pool not responding

2016-03-02 Thread Mario Giammarco
Here it is:

 cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
 health HEALTH_WARN
4 pgs incomplete
4 pgs stuck inactive
4 pgs stuck unclean
1 requests are blocked > 32 sec
 monmap e8: 3 mons at {0=
10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
election epoch 840, quorum 0,1,2 0,1,2
 osdmap e2405: 3 osds: 3 up, 3 in
  pgmap v5904430: 288 pgs, 4 pools, 391 GB data, 100 kobjects
1090 GB used, 4481 GB / 5571 GB avail
 284 active+clean
   4 incomplete
  client io 4008 B/s rd, 446 kB/s wr, 23 op/s


2016-03-02 9:31 GMT+01:00 Shinobu Kinjo <ski...@redhat.com>:

> Is "ceph -s" still showing you same output?
>
> > cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
> >  health HEALTH_WARN
> > 4 pgs incomplete
> > 4 pgs stuck inactive
> > 4 pgs stuck unclean
> >  monmap e8: 3 mons at
> > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
> > election epoch 832, quorum 0,1,2 0,1,2
> >  osdmap e2400: 3 osds: 3 up, 3 in
> >   pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100 kobjects
> > 1090 GB used, 4481 GB / 5571 GB avail
> >  284 active+clean
> >    4 incomplete
>
> Cheers,
> S
>
> - Original Message -
> From: "Mario Giammarco" <mgiamma...@gmail.com>
> To: "Lionel Bouton" <lionel-subscript...@bouton.name>
> Cc: "Shinobu Kinjo" <ski...@redhat.com>, ceph-users@lists.ceph.com
> Sent: Wednesday, March 2, 2016 4:27:15 PM
> Subject: Re: [ceph-users] Help: pool not responding
>
> Tried to set min_size=1 but unfortunately nothing has changed.
> Thanks for the idea.
>
> 2016-02-29 22:56 GMT+01:00 Lionel Bouton <lionel-subscript...@bouton.name
> >:
>
> > Le 29/02/2016 22:50, Shinobu Kinjo a écrit :
> >
> > the fact that they are optimized for benchmarks and certainly not
> > Ceph OSD usage patterns (with or without internal journal).
> >
> > Are you assuming that SSHD is causing the issue?
> > If you could elaborate on this more, it would be helpful.
> >
> >
> > Probably not (unless they reveal themselves extremely unreliable with
> Ceph
> > OSD usage patterns which would be surprising to me).
> >
> > For incomplete PG the documentation seems good enough for what should be
> > done :
> > http://docs.ceph.com/docs/master/rados/operations/pg-states/
> >
> > The relevant text:
> >
> > *Incomplete* Ceph detects that a placement group is missing information
> > about writes that may have occurred, or does not have any healthy copies.
> > If you see this state, try to start any failed OSDs that may contain the
> > needed information or temporarily adjust min_size to allow recovery.
> >
> > We don't have the full history but the most probable cause of these
> > incomplete PGs is that min_size is set to 2 or 3 and at some time the 4
> > incomplete pgs didn't have as many replica as the min_size value. So if
> > setting min_size to 2 isn't enough setting it to 1 should unfreeze them.
> >
> > Lionel
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: Help: pool not responding

2016-03-02 Thread Mario Giammarco
Tried to set min_size=1 but unfortunately nothing has changed.
Thanks for the idea.

2016-02-29 22:56 GMT+01:00 Lionel Bouton :

> Le 29/02/2016 22:50, Shinobu Kinjo a écrit :
>
> the fact that they are optimized for benchmarks and certainly not
> Ceph OSD usage patterns (with or without internal journal).
>
> Are you assuming that SSHD is causing the issue?
> If you could elaborate on this more, it would be helpful.
>
>
> Probably not (unless they reveal themselves extremely unreliable with Ceph
> OSD usage patterns which would be surprising to me).
>
> For incomplete PG the documentation seems good enough for what should be
> done :
> http://docs.ceph.com/docs/master/rados/operations/pg-states/
>
> The relevant text:
>
> *Incomplete* Ceph detects that a placement group is missing information
> about writes that may have occurred, or does not have any healthy copies.
> If you see this state, try to start any failed OSDs that may contain the
> needed information or temporarily adjust min_size to allow recovery.
>
> We don't have the full history but the most probable cause of these
> incomplete PGs is that min_size is set to 2 or 3 and at some time the 4
> incomplete pgs didn't have as many replica as the min_size value. So if
> setting min_size to 2 isn't enough setting it to 1 should unfreeze them.
>
> Lionel
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-02-29 Thread Mario Giammarco
Oliver Dzombic  writes:

> 
> Hi,
> 
> i dont know, but as it seems to me:
> 
> incomplete = not enough data
> 
> the only solution would be to drop it ( delete )
> 
> so the cluster get in active healthy state.
> 
> How many copies do you do from each data ?
> 


Do you mean dropping the pgs that are not working, or the entire pool?

It is a pool with replication=3 and I always had at least two osds up.

Is replication=3 not enough?

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-02-29 Thread Mario Giammarco
Mario Giammarco <mgiammarco@...> writes:

Sorry 
ceph health detail is:


HEALTH_WARN 4 pgs incomplete; 4 pgs stuck inactive; 4 pgs stuck unclean
pg 0.0 is stuck inactive for 4836623.776873, current state incomplete, last
acting [0,1,3]
pg 0.40 is stuck inactive for 2773379.028048, current state incomplete, last
acting [1,0,3]
pg 0.3f is stuck inactive for 4836763.332907, current state incomplete, last
acting [0,3,1]
pg 0.3b is stuck inactive for 4836777.230337, current state incomplete, last
acting [0,3,1]
pg 0.0 is stuck unclean for 4850437.633464, current state incomplete, last
acting [0,1,3]
pg 0.40 is stuck unclean for 4850437.633467, current state incomplete, last
acting [1,0,3]
pg 0.3f is stuck unclean for 4850456.399217, current state incomplete, last
acting [0,3,1]
pg 0.3b is stuck unclean for 4850490.534154, current state incomplete, last
acting [0,3,1]
pg 0.40 is incomplete, acting [1,0,3]
pg 0.3f is incomplete, acting [0,3,1]
pg 0.3b is incomplete, acting [0,3,1]
pg 0.0 is incomplete, acting [0,1,3]



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-02-29 Thread Mario Giammarco
Thank you for your time.
Dimitar Boichev  writes:

> 
> I am sure that I speak for the majority of people reading this, when I say
that I didn't get anything from your emails.
> Could you provide more debug information ?
> Like (but not limited to):
> ceph -s 
> ceph health details
> ceph osd tree

I did in fact ask what I need to provide, because honestly I do not know.

Here is ceph -s:

cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
 health HEALTH_WARN
4 pgs incomplete
4 pgs stuck inactive
4 pgs stuck unclean
 monmap e8: 3 mons at
{0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
election epoch 832, quorum 0,1,2 0,1,2
 osdmap e2400: 3 osds: 3 up, 3 in
  pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100 kobjects
1090 GB used, 4481 GB / 5571 GB avail
 284 active+clean
   4 incomplete

ceph health detail:

cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
 health HEALTH_WARN
4 pgs incomplete
4 pgs stuck inactive
4 pgs stuck unclean
 monmap e8: 3 mons at
{0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
election epoch 832, quorum 0,1,2 0,1,2
 osdmap e2400: 3 osds: 3 up, 3 in
  pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100 kobjects
1090 GB used, 4481 GB / 5571 GB avail
 284 active+clean
   4 incomplete

ceph osd tree:

ID WEIGHT  TYPE NAME  UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 5.42999 root default 
-2 1.81000 host proxmox-quad3   
 0 1.81000 osd.0   up  1.0  1.0 
-3 1.81000 host proxmox-zotac   
 1 1.81000 osd.1   up  1.0  1.0 
-4 1.81000 host proxmox-hp  
 3 1.81000 osd.3   up  1.0  1.0 


> 
> I am really having a bad time trying to decode the exact problems.
> First you had network issues, then osd failed (in the same time or after?),
> Then the cluser did not have enough free space to recover I suppose  ?
> 
It is a three server/osd test/evaluation system with Ceph and Proxmox PVE.
The load is very light and there is a lot of free space.

So:

- I NEVER had network issues. People TOLD me that I must have network
problems. I changed cables and switches just in case, but nothing improved.
- One disk had bad sectors, so I added another disk/osd and then removed
the old osd, following the official documentation. After that the cluster
ran fine for two months, so there was enough free space and the cluster had
recovered.
- Then one day I discovered that the proxmox backup had hung, and I saw
that it was because ceph was not responding.


> Regarding the slow SSD disks, what disks are you using ?

I said SSHD, that is a standard hdd with an ssd cache. It is 7200 rpm, but
in benchmarks it performs better than a 10000 rpm disk.
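
To rule out slow disks, the osd-level benchmarks are easy to run (osd ids
taken from my "ceph osd tree" above):

  ceph tell osd.0 bench        # writes 1 GB to the osd and reports throughput
  ceph tell osd.1 bench
  ceph tell osd.3 bench
  ceph osd perf                # commit/apply latency per osd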

Thanks again,
Mario


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-02-29 Thread Mario Giammarco
Ferhat Ozkasgarli  writes:


> 1-) One of the OSD nodes has network problem.
> 2-) Disk failure
> 3-) Not enough resource for OSD nodes
> 4-) Slow OSD Disks

I have replaced cables and switches, and I am sure there are no network
problems. The disks are SSHDs, so they are fast, and the nodes have plenty
of free memory. I have a simple cluster with three nodes just to
experiment. One brand new disk failed some time ago, so I added a new osd
and removed the old one using the official procedure in the documentation.

What can I do now? How can I debug?
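
So far the only checks I know how to run are the basic ones (pg 0.0 is one
of the incomplete pgs from "ceph health detail"):

  ceph health detail
  ceph osd tree
  ceph pg 0.0 query            # shows the peering / recovery_state details
  grep -iE 'incomplete|peering' /var/log/ceph/ceph-osd.0.log | tail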

Thanks again,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-02-16 Thread Mario Giammarco
Mark Nelson  writes:


> PGs are pool specific, so the other pool may be totally healthy while 
> the first is not.  If it turns out it's a hardware problem, it's also 
> possible that the 2nd pool may not hit all of the same OSDs as the first 
> pool, especially if it has a low PG count.
> 

Just to be clear: I have a cluster with three servers and three osds. The
replica count is three, so it is impossible that I am not touching all osds.

How can I tell ceph to discard those pgs?
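
From what I have read, the closest thing to "discarding" a pg is the
command below, but it recreates the pg empty, so whatever data it held is
lost; I write it down only as what I understood, not as a recommendation:

  ceph pg force_create_pg 0.0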

Thanks again for help,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-02-15 Thread Mario Giammarco
Karan Singh  writes:


> Agreed to Ferhat.
> 
> Recheck your network ( bonds , interfaces , network switches , even cables 
) 

I use gigabit ethernet and I am checking the network.
But I am using another pool on the same cluster and it works perfectly: why?

Thanks again,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help: pool not responding

2016-02-15 Thread Mario Giammarco
koukou73gr  writes:

> 
> Have you tried restarting  osd.0 ?
> 
Yes I have restarted all osds many times.
Also launched repair and scrub.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Help: pool not responding

2016-02-14 Thread Mario Giammarco
Hello,
I am using ceph hammer under proxmox. 
I have a working cluster that I have been using for several months.
For reasons yet to be discovered, I am now in this situation:

HEALTH_WARN 4 pgs incomplete; 4 pgs stuck inactive; 4 pgs stuck unclean; 7 
requests are blocked > 32 sec; 1 osds have slow requests
pg 0.0 is stuck inactive for 3541712.92, current state incomplete, last 
acting [0,1,3]
pg 0.40 is stuck inactive for 1478467.695684, current state incomplete, 
last acting [1,0,3]
pg 0.3f is stuck inactive for 3541852.000546, current state incomplete, 
last acting [0,3,1]
pg 0.3b is stuck inactive for 3541865.897979, current state incomplete, 
last acting [0,3,1]
pg 0.0 is stuck unclean for 326.301120, current state incomplete, last 
acting [0,1,3]
pg 0.40 is stuck unclean for 326.301128, current state incomplete, last 
acting [1,0,3]
pg 0.3f is stuck unclean for 345.066879, current state incomplete, last 
acting [0,3,1]
pg 0.3b is stuck unclean for 379.201819, current state incomplete, last 
acting [0,3,1]
pg 0.40 is incomplete, acting [1,0,3]
pg 0.3f is incomplete, acting [0,3,1]
pg 0.3b is incomplete, acting [0,3,1]
pg 0.0 is incomplete, acting [0,1,3]
7 ops are blocked > 2097.15 sec
7 ops are blocked > 2097.15 sec on osd.0
1 osds have slow requests


The problem is that when I try to read or write to pool "rbd" (where I
have all my virtual machines), ceph starts to log "slow reads" and the
system hangs.
If in the same cluster I create another pool and create an image inside
it, I can read and write correctly (and fast), so it seems the cluster is
working and only that pool is not working.
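
For reference, a quick way to reproduce my test with a second pool is
roughly this (pool and image names are placeholders):

  ceph osd pool create test2 128
  rbd create -p test2 --size 10240 testimage
  rados -p test2 bench 10 write --no-cleanup
  rados -p test2 bench 10 seq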

Can you help me?
Thanks,
Mario



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Strange configuration with many SAN and few servers

2014-11-08 Thread Mario Giammarco
Gregory Farnum greg@... writes:

 
 
  and then to replace the server you could just mount the LUNs somewhere
else and turn on the OSDs. You would need to set a few config options (like
the one that automatically updates crush location on boot), but it shouldn't
be too difficult.

Thank you for your reply!
So basically you confirm that all OSD data stays on the lun, and on the
replacement server I only need a standard ceph install, after which I can
add the osd again, am I right?
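
If I understood correctly, the replacement flow would be something like
this (device path and osd id are placeholders, and the crush option goes
into ceph.conf on the new server):

  # /etc/ceph/ceph.conf
  [osd]
      osd crush update on start = true

  # after a standard ceph install on the replacement server
  mount /dev/mapper/san-lun1 /var/lib/ceph/osd/ceph-2
  service ceph start osd.2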

Thanks again,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Strange configuration with many SAN and few servers

2014-11-07 Thread Mario Giammarco
Hello,
I need to build a ceph test lab.
I have to do it with existing hardware.
I have several iscsi and fibre channel SANs but few servers.
Imagine I have:

- 4 SAN with 1 lun on each san
- 2 diskless (apart from the boot disk) servers

I mount two luns on the first server and two luns on the second server.
Then (I suppose) I put 4 ceph osds, one on each lun.
Now if a server breaks I lose two osds, but the osd data is not lost
because it is on disk.
My question is: if I replace the server, can I use the osds again by
remounting the luns on the new server?

Thanks,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Armel debian repository

2013-12-22 Thread Mario Giammarco
Wido den Hollander wido@... writes:


 
 What version of ARM CPU is in the Netgear NAS?
 
 Since the packages are build for ARMv7 and for example don't work on a 
 RaspberryPi which is ARMv6.
 
 Another solution would be to build to packages manually for the Netgear NAS.


It is a Marvell Armada 370: ARMv7 with a hardware floating point unit (and
NX bit support, if that helps).

But the installed system is debian/armel.
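
If it comes down to building manually on the NAS itself, I suppose the
usual debian way applies (assuming deb-src entries for a ceph source repo
are configured; the build will be very slow on this CPU):

  apt-get install build-essential devscripts debhelper
  apt-get build-dep ceph
  apt-get source ceph
  cd ceph-*
  dpkg-buildpackage -us -uc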

Thanks,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Armel debian repository

2013-12-21 Thread Mario Giammarco
Mario Giammarco mgiammarco@... writes:

 
 Hello,
 I would like to install ceph on a Netgear ReadyNAS 102.
 It is a debian wheezy based.
 I have tried to add ceph repository but nas is armel architecture and I
 see you provide a repo for armhf architecture.
 
 How can I solve this problem?
 
 Thanks,
 Mario
 


Hello again,
no one can help me?
A tutorial? A small hint?
Cross-compiling?
Some armel repository?

Thanks again,
Mario


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Armel debian repository

2013-12-19 Thread Mario Giammarco
Hello,
I would like to install ceph on a Netgear ReadyNAS 102.
It is debian wheezy based.
I have tried to add the ceph repository, but the NAS is armel architecture
and I see you only provide a repo for the armhf architecture.

How can I solve this problem?

Thanks,
Mario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com