Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-20 Thread Sage Weil
Hi Marc,

On Fri, 10 Nov 2017, Marc Roos wrote:
>  
> osd's are crashing when putting an (8GB) file in an erasure coded pool, 

I take it you adjusted the osd_max_object_size option in your ceph.conf?  
We can "fix" this by enforcing a hard limit on that option, but that 
will just mean you get an error when you try to write the large 
object or offset instead of a crash.

sage
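(Editor's note: for readers following along, the option Sage refers to is set in the [osd] section of ceph.conf; the value below is purely illustrative, not a recommendation, and BlueStore additionally enforces its own ~4GiB per-object ceiling regardless of this setting.)

```ini
[osd]
# illustrative value only (128 MiB); check your release's default
osd max object size = 134217728
```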



> just before finishing. The same osd's are used for replicated pools 
> rbd/cephfs, and seem to do fine. Did I make an error, or is this a bug? 
> Looks similar to
> https://www.spinics.net/lists/ceph-devel/msg38685.html
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021045.html

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Christian Wuerdig
I haven't used the rados command line utility much, but it has an "-o
object_size" option as well as "--striper" to make it use the
libradosstriper library, so I'd suggest giving these options a go.
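(Editor's note: the effect of the striper can be sketched locally. A single logical object is stored as many fixed-size RADOS objects, each below BlueStore's per-object ceiling. The split(1) simulation below uses toy sizes; the `rados --striper put` line in the comment is the cluster-side equivalent and of course needs a live pool.)

```shell
# Simulate striping: one 10 MiB "object" stored as 4 MiB chunks,
# the way libradosstriper shards a large object across RADOS objects.
dd if=/dev/zero of=bigobject bs=1M count=10 2>/dev/null
split -b 4M -d bigobject bigobject.chunk.
ls -l bigobject.chunk.*   # three chunks: 4 MiB, 4 MiB, 2 MiB

# Cluster-side equivalent (illustrative):
#   rados --striper -p ec21 put bigobject ./bigobject
```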

On Mon, Nov 13, 2017 at 9:40 PM, Marc Roos <m.r...@f1-outsourcing.eu> wrote:
>
> 1. I don’t think an osd should 'crash' in such a situation.
> 2. How else should I 'rados put' an 8GB file?
>
>
>
>
>
>
> -Original Message-
> From: Christian Wuerdig [mailto:christian.wuer...@gmail.com]
> Sent: maandag 13 november 2017 0:12
> To: Marc Roos
> Cc: ceph-users
> Subject: Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1
>
> As per: https://www.spinics.net/lists/ceph-devel/msg38686.html
> Bluestore has a hard 4GB object size limit
>

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Marc Roos
 

I have been asking myself (and here) the same question. I think it is 
because of having this in ceph.conf:
enable experimental unrecoverable data corrupting features = bluestore
But I am not sure whether I can simply remove this, or have to replace it 
with something else.
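(Editor's note: as far as I understand, BlueStore left experimental status in Luminous, so on a pure 12.2.x cluster this line is itself what still triggers the warning and can be deleted outright; sketch below, verify on a test node first.)

```ini
# ceph.conf, [global] -- before (pre-Luminous, BlueStore experimental):
enable experimental unrecoverable data corrupting features = bluestore

# after (Luminous 12.2.x, BlueStore stable): line removed entirely;
# the "dangerous and experimental features" warning disappears with it.
```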

ceph-12.2.1-0.el7.x86_64
ceph-base-12.2.1-0.el7.x86_64
ceph-common-12.2.1-0.el7.x86_64
ceph-mds-12.2.1-0.el7.x86_64
ceph-mgr-12.2.1-0.el7.x86_64
ceph-mon-12.2.1-0.el7.x86_64
ceph-osd-12.2.1-0.el7.x86_64
ceph-selinux-12.2.1-0.el7.x86_64
collectd-ceph-5.7.1-2.el7.x86_64
libcephfs2-12.2.1-0.el7.x86_64
nfs-ganesha-ceph-2.5.2-.el7.x86_64
python-cephfs-12.2.1-0.el7.x86_64




-Original Message-
From: Caspar Smit [mailto:caspars...@supernas.eu] 
Sent: maandag 13 november 2017 9:58
To: ceph-users
Subject: Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

Hi,

Why would Ceph 12.2.1 give you this message:

2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore



Or is that a leftover warning message from an old client?

Kind regards,
Caspar


2017-11-10 21:27 GMT+01:00 Marc Roos <m.r...@f1-outsourcing.eu>:




Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Caspar Smit
Hi,

Why would Ceph 12.2.1 give you this message:

2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore

Or is that a leftover warning message from an old client?

Kind regards,
Caspar

2017-11-10 21:27 GMT+01:00 Marc Roos :

>
> osd's are crashing when putting an (8GB) file in an erasure coded pool,
> just before finishing. The same osd's are used for replicated pools
> rbd/cephfs, and seem to do fine. Did I make an error, or is this a bug?
> Looks similar to
> https://www.spinics.net/lists/ceph-devel/msg38685.html
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/
> 2017-September/021045.html
>

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Marc Roos
 
1. I don’t think an osd should 'crash' in such a situation. 
2. How else should I 'rados put' an 8GB file?






-Original Message-
From: Christian Wuerdig [mailto:christian.wuer...@gmail.com] 
Sent: maandag 13 november 2017 0:12
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

As per: https://www.spinics.net/lists/ceph-devel/msg38686.html
Bluestore has a hard 4GB object size limit


On Sat, Nov 11, 2017 at 9:27 AM, Marc Roos <m.r...@f1-outsourcing.eu> 
wrote:

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-12 Thread Christian Wuerdig
As per: https://www.spinics.net/lists/ceph-devel/msg38686.html
Bluestore has a hard 4GB object size limit
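(Editor's note: the arithmetic makes the failure concrete. The object size reported by `rados stat` elsewhere in the thread is larger than a 32-bit offset can address, which is the 4GB ceiling referred to.)

```shell
size=8585740288                      # from the `rados stat` output in the thread
limit=$((4 * 1024 * 1024 * 1024))    # 4 GiB = 2^32 bytes
echo "$limit"                        # 4294967296
echo $(( size > limit ))             # 1 -> the ~8 GB object exceeds the limit
```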


On Sat, Nov 11, 2017 at 9:27 AM, Marc Roos  wrote:
>
> osd's are crashing when putting an (8GB) file in an erasure coded pool,
> just before finishing. The same osd's are used for replicated pools
> rbd/cephfs, and seem to do fine. Did I make an error, or is this a bug?
> Looks similar to
> https://www.spinics.net/lists/ceph-devel/msg38685.html
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021045.html
>

[ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-10 Thread Marc Roos
 
osd's are crashing when putting an (8GB) file in an erasure coded pool, 
just before finishing. The same osd's are used for replicated pools 
rbd/cephfs, and seem to do fine. Did I make an error, or is this a bug? 
Looks similar to
https://www.spinics.net/lists/ceph-devel/msg38685.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021045.html


[@c01 ~]# date ; rados -p ec21 put  $(basename 
"/mnt/disk/blablablalbalblablalablalb.txt") 
blablablalbalblablalablalb.txt
Fri Nov 10 20:27:26 CET 2017

[Fri Nov 10 20:33:51 2017] libceph: osd9 down
[Fri Nov 10 20:33:51 2017] libceph: osd9 down
[Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket 
closed (con state OPEN)
[Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket 
error on write
[Fri Nov 10 20:33:52 2017] libceph: osd0 down
[Fri Nov 10 20:33:52 2017] libceph: osd7 down
[Fri Nov 10 20:33:55 2017] libceph: osd0 down
[Fri Nov 10 20:33:55 2017] libceph: osd7 down
[Fri Nov 10 20:34:41 2017] libceph: osd7 up
[Fri Nov 10 20:34:41 2017] libceph: osd7 up
[Fri Nov 10 20:35:03 2017] libceph: osd9 up
[Fri Nov 10 20:35:03 2017] libceph: osd9 up
[Fri Nov 10 20:35:47 2017] libceph: osd0 up
[Fri Nov 10 20:35:47 2017] libceph: osd0 up

[@c02 ~]# rados -p ec21 stat blablablalbalblablalablalb.txt
2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-11-10 20:39:31.296290 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-11-10 20:39:31.331588 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
ec21/blablablalbalblablalablalb.txt mtime 2017-11-10 20:32:52.00, 
size 8585740288



2017-11-10 20:32:52.287503 7f933028d700  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1510342372287484, "job": 32, "event": "flush_started", 
"num_memtables": 1, "num_entries": 728747, "num_deletes": 363960, 
"memory_usage": 263854696}
2017-11-10 20:32:52.287509 7f933028d700  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:293] 
[default] [JOB 32] Level-0 flush table #25279: started
2017-11-10 20:32:52.503311 7f933028d700  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1510342372503293, "cf_name": "default", "job": 32, 
"event": "table_file_creation", "file_number": 25279, "file_size": 
4811948, "table_properties": {"data_size": 4675796, "index_size": 
102865, "filter_size": 32302, "raw_key_size": 646440, 
"raw_average_key_size": 75, "raw_value_size": 4446103, 
"raw_average_value_size": 519, "num_data_blocks": 1180, "num_entries": 
8560, "filter_policy_name": "rocksdb.BuiltinBloomFilter", 
"kDeletedKeys": "0", "kMergeOperands": "330"}}
2017-11-10 20:32:52.503327 7f933028d700  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:319] 
[default] [JOB 32] Level-0 flush table #25279: 4811948 bytes OK
2017-11-10 20:32:52.572413 7f933028d700  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_files.cc:242] 
adding log 25276 to recycle list

2017-11-10 20:32:52.572422 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.503339) 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/memtable_list.cc:360] 
[default] Level-0 commit table #25279 started
2017-11-10 20:32:52.572425 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.572312) 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/memtable_list.cc:383] 
[default] Level-0 commit table #25279: memtable #1 done
2017-11-10 20:32:52.572428 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.572328) EVENT_LOG_v1 {"time_micros": 
1510342372572321, "job": 32, "event": "flush_finished", "lsm_state": [4, 
4, 36, 140, 0, 0, 0], "immutable_memtables": 0}
2017-11-10 20:32:52.572430 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.572397) 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_compaction_flush
.cc:132] [default] Level summary: base level 1 max bytes base 268435456 
files[4 4 36 140 0 0 0] max score 1.00

2017-11-10 20:32:52.572491 7f933028d700  4 rocksdb: