Re: [ceph-users] low power single disk nodes
Rather expensive option: Applied Micro X-Gene. Overkill for a single disk, and only really available in a development-kit format right now. https://www.apm.com/products/data-center/x-gene-family/x-c1-development-kits/

Better option: Ambedded CY7 - 7 nodes in 1U half depth, 6 positions for SATA disks, and one node with an mSATA SSD. http://www.ambedded.com.tw/pt_list.php?CM_ID=20140214001

--phil

On 09 April 2015 at 15:57 Quentin Hartman qhart...@direwolfdigital.com wrote:

I'm skeptical about how well this would work, but a Banana Pi might be a place to start. Like a Raspberry Pi, but it has a SATA connector: http://www.bananapi.org/

On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg jer...@update.uu.se wrote:

Hello ceph users,

Is anyone running any low-powered single-disk nodes with Ceph now? Calxeda seems to be no more, according to Wikipedia. I do not think HP Moonshot is what I am looking for - I want stand-alone nodes, not server cartridges integrated into a server chassis. And I do not want to be locked in to a single vendor.

I was playing with a Raspberry Pi 2 for signage when I thought of my old experiments with Ceph. I am thinking of, for example, the Odroid-C1 or Odroid-XU3 Lite, or maybe something with a low-power Intel x86/x64 processor. Together with one SSD or one low-power HDD, the node could get all its power via PoE (via a splitter, or integrated into the board if such boards exist). PoE provides remote power-on/power-off even for consumer-grade nodes.

The cost for a single low-power node should be able to compete with a traditional PC-server's price per disk. Ceph takes care of redundancy. I think simple custom casing should be good enough - maybe just strap or velcro everything onto trays in the rack, at least for the nodes with SSDs.

Kind regards,
--
Jerker Nyberg, Uppsala, Sweden.
Re: [ceph-users] low power single disk nodes
Notice that this is under their emerging-technologies section. I don't think you can buy them yet. Hopefully we'll know more as time goes on. :)

Mark

On 04/09/2015 10:52 AM, Stillwell, Bryan wrote:

These are really interesting to me, but how can you buy them? What's the performance like in Ceph? Are they using the keyvaluestore backend, or something specific to these drives? Also, what kind of chassis do they go into (some kind of Ethernet JBOD)?

Bryan

On 4/9/15, 9:43 AM, Mark Nelson mnel...@redhat.com wrote:

How about drives that run Linux with an ARM processor, RAM, and an Ethernet port right on the drive? Notice the Ceph logo. :) https://www.hgst.com/science-of-storage/emerging-technologies/open-ethernet-drive-architecture

Mark

On 04/09/2015 10:37 AM, Scott Laird wrote: Minnowboard Max? 2 Atom cores, 1 SATA port, and a real (non-USB) Ethernet port. [snip - remainder of quoted thread appears in full above]
Re: [ceph-users] low power single disk nodes
How about drives that run Linux with an ARM processor, RAM, and an Ethernet port right on the drive? Notice the Ceph logo. :) https://www.hgst.com/science-of-storage/emerging-technologies/open-ethernet-drive-architecture

Mark

On 04/09/2015 10:37 AM, Scott Laird wrote:

Minnowboard Max? 2 Atom cores, 1 SATA port, and a real (non-USB) Ethernet port. [snip - remainder of quoted thread appears in full above]
Re: [ceph-users] long blocking with writes on rbds
On Wed, Apr 8, 2015 at 7:36 PM, Lionel Bouton lionel+c...@bouton.name wrote:

On 04/08/15 18:24, Jeff Epstein wrote:

Hi, I'm having sporadic, very poor performance running Ceph. Right now mkfs, even with nodiscard, takes 30 minutes or more. These kinds of delays happen often but irregularly. There seems to be no common denominator. Clearly, however, they make it impossible to deploy Ceph in production. I reported this problem earlier on Ceph's IRC and was told to add nodiscard to mkfs. That didn't help. Here is the command that I'm using to format an rbd:

    mkfs.ext4 -t ext4 -m0 -b4096 -E nodiscard /dev/rbd1

I probably won't be able to help much, but people knowing more will need at least:
- your Ceph version,
- the kernel version of the host on which you are trying to format /dev/rbd1,
- which hardware and network you are using for this cluster (CPU, RAM, HDD or SSD models, network cards, jumbo frames, ...).

Ceph says everything is okay:

    cluster e96e10d3-ad2b-467f-9fe4-ab5269b70206
     health HEALTH_OK
     monmap e1: 3 mons at {a=192.168.224.4:6789/0,b=192.168.232.4:6789/0,c=192.168.240.4:6789/0}, election epoch 12, quorum 0,1,2 a,b,c
     osdmap e972: 6 osds: 6 up, 6 in
     pgmap v4821: 4400 pgs, 44 pools, 5157 MB data, 1654 objects
           46138 MB used, 1459 GB / 1504 GB avail
           4400 active+clean

Are there any slow request warnings in the logs? Assuming a 30-minute mkfs is somewhat reproducible, can you bump osd and ms log levels and try to capture it?

Thanks,
Ilya
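As a rough sketch of what "bump osd and ms log levels" can look like at runtime (injectargs applies to running daemons without a restart; the osd.* target hits every OSD):

    # raise OSD and messenger debug levels while reproducing the slow mkfs
    ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 1'
    # watch for slow/blocked request warnings while it runs
    ceph health detail
    # restore the defaults afterwards
    ceph tell osd.* injectargs '--debug-osd 0/5 --debug-ms 0/5'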
[ceph-users] Motherboard recommendation?
Hi,

I have a backup storage with Ceph 0.93. As with every backup system, it is only ever written to and hopefully never read. The hardware is 3 Supermicro SC847 cases with 30 SATA HDDs each (2- and 4-TB WD disks) = 250TB. I have realized that the motherboards and CPUs are totally undersized, so I want to install new boards. I'm thinking of the following: 3 Supermicro X10DRH-CT or X10DRC-T4+ with 128GB memory each.

What do you think about these boards? Will they fit into the SC847? They have SAS and 10GBase-T onboard, so no extra controller seems to be necessary. What Xeon v3 should I take, and how many cores? Does anyone know if M.2 SSDs are supported in their PCIe slots?

Thank you very much,
Markus
--
Markus Goldberg, Universität Hildesheim, Rechenzentrum
Universitätsplatz 1, D-31141 Hildesheim, Germany
Tel +49 5121 88392822, Fax +49 5121 88392823
email goldb...@uni-hildesheim.de
Re: [ceph-users] Motherboard recommendation?
Hi Mohamed,

thank you for your reply. I thought there is a SAS expander on the backplanes of the SC847, so all drives can be run. Am I wrong?

thanks,
Markus

Am 09.04.2015 um 10:24 schrieb Mohamed Pakkeer:

Hi Markus,

The X10DRH-CT can support only 16 drives by default. If you want to connect more drives, there is a special SKU with more-drive support from Supermicro, or you need an additional SAS controller. We are using two 2630 v3 (8-core, 2.4GHz) for 30 drives on an SM X10DRI-T. It is working perfectly on a replication-based cluster. If you are planning to use erasure coding, you have to think about a higher spec. Does anyone know the exact processor requirement of a 30-drive node for erasure coding? I can't find a suitable hardware recommendation for erasure coding.

Cheers
K. Mohamed Pakkeer

On Thu, Apr 9, 2015 at 1:30 PM, Markus Goldberg goldb...@uni-hildesheim.de wrote: [snip - original message quoted in full above]
Re: [ceph-users] SSD Hardware recommendation
Hi all,

just an update - but an important one - of the previous benchmark, with 2 new 10-DWPD-class contenders:
- Seagate 1200 - ST200FM0053 - SAS 12Gb/s
- Intel DC S3700 - SATA 6Gb/s

The graph: http://www.4shared.com/download/yaeJgJiFce/Perf-SSDs-Toshiba-Seagate-Inte.png?lgfp=3000

It speaks for itself: the Seagate is clearly a massive improvement over our best SSD so far (the Toshiba M2). That's a 430MB/s write bandwidth reached with blocks as small as 4KB, written with the SYNC and DIRECT flags. This was somewhat expected after reading this review: http://www.tweaktown.com/reviews/6075/seagate-1200-stx00fm-12gb-s-sas-enterprise-ssd-review/index.html

An impressive result that should make the Seagate an SSD of choice for journals on hosts with SAS controllers. I also had access to an Intel DC S3700, an unavoidable reference as a Ceph journal. Indeed not bad on 4k blocks for the price. The benchmarks were made on a Dell R730xd with an H730P SAS controller (LSI 3108 12Gb/s SAS).

Frederic

f...@univ-lr.fr a écrit le 31/03/15 14:09 :

Hi,

in our quest to get the right SSD for OSD journals, I managed to benchmark two kinds of 10-DWPD SSDs:
- Toshiba M2 PX02SMF020
- Samsung 845DC PRO

I want to determine whether a disk is appropriate considering its absolute performance, and the optimal number of ceph-osd processes using the SSD as a journal. The benchmark consists of a fio command with the SYNC and DIRECT access options and 4k-block write accesses:

    fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k --runtime=60 --time_based --group_reporting --name=journal-test --iodepth=1 or 16 --numjobs= ranging from 1 to 16

I think numjobs can represent the number of concurrent OSDs served by this SSD. Am I right on this?

http://www.4shared.com/download/WOvooKVXce/Fio-Direct-Sync-ToshibaM2-Sams.png?lgfp=3000

My understanding of the data is that the 845DC Pro cannot be used for more than 4 OSDs. The M2 is very consistent in its behavior. The iodepth has almost no impact on performance here. Could someone with other SSD types run the same test to consolidate the data? Among the shortlist that could be considered for this task (for their price/performance/DWPD/...):
- Seagate 1200 SSD 200GB, SAS 12Gb/s, ST200FM0053
- Hitachi SSD800MM MLC HUSMM8020ASS200
- Intel DC S3700

I've not yet considered the write amplification mentioned in other posts.

Frederic

Josef Johansson jose...@gmail.com a écrit le 20/03/15 10:29 :

The 845DC Pro does look really nice, comparable with the S3700 even in TBW. The price is what really does it, as it's almost a third compared with the S3700.
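For anyone wanting to reproduce the sweep, a minimal sketch of the benchmark loop as described above - the device name is an assumption, and note this writes directly to the raw device, destroying its contents:

    #!/bin/bash
    # 4k sync+direct sequential writes, numjobs swept from 1 to 16
    # (the posts above also repeat the sweep with --iodepth=16)
    DEV=/dev/sda
    for jobs in 1 2 4 8 16; do
        fio --filename=$DEV --direct=1 --sync=1 --rw=write --bs=4k \
            --runtime=60 --time_based --group_reporting \
            --name=journal-test --iodepth=1 --numjobs=$jobs
    done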
[ceph-users] protocol feature mismatch after upgrading to Hammer
I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly repeating this message:

2015-04-09 08:50:26.318042 7f95dbf86700 0 -- 10.5.38.1:0/2037478 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1 c=0x7f95e0023670).connect protocol feature mismatch, my 3fff peer 13fff missing 1

It isn't always the same IP for the destination - here's another:

2015-04-09 08:50:20.322059 7f95dc087700 0 -- 10.5.38.1:0/2037478 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1 c=0x7f95e002b480).connect protocol feature mismatch, my 3fff peer 13fff missing 1

Some details about our install: We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an erasure-coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts are mds. We are running cephfs with a client trying to write data over cephfs when we're seeing these messages.

Any ideas?
Re: [ceph-users] MDS unmatched rstat after upgrade hammer
Alright, sounds good. Only one comment then: from an IT/ops perspective, all I see is ERR, and that raises red flags. So the exposure of the message might need some tweaking. In production I like to be notified of an issue but have reassurance it was fixed within the system.

Best Regards

On Wed, Apr 8, 2015 at 8:10 PM Yan, Zheng uker...@gmail.com wrote:

On Thu, Apr 9, 2015 at 7:09 AM, Scottix scot...@gmail.com wrote:

I was testing the upgrade on our dev environment, and after I restarted the mds I got the following errors:

2015-04-08 15:58:34.056470 mds.0 [ERR] unmatched rstat on 605, inode has n(v70 rc2015-03-16 09:11:34.390905), dirfrags have n(v0 rc2015-03-16 09:11:34.390905 1=0+1)
2015-04-08 15:58:34.056530 mds.0 [ERR] unmatched rstat on 604, inode has n(v69 rc2015-03-31 08:07:09.265241), dirfrags have n(v0 rc2015-03-31 08:07:09.265241 1=0+1)
2015-04-08 15:58:34.056581 mds.0 [ERR] unmatched rstat on 606, inode has n(v67 rc2015-03-16 08:54:36.314790), dirfrags have n(v0 rc2015-03-16 08:54:36.314790 1=0+1)
2015-04-08 15:58:34.056633 mds.0 [ERR] unmatched rstat on 607, inode has n(v57 rc2015-03-16 08:54:46.797240), dirfrags have n(v0 rc2015-03-16 08:54:46.797240 1=0+1)
2015-04-08 15:58:34.056687 mds.0 [ERR] unmatched rstat on 608, inode has n(v23 rc2015-03-16 08:54:59.634299), dirfrags have n(v0 rc2015-03-16 08:54:59.634299 1=0+1)
2015-04-08 15:58:34.056737 mds.0 [ERR] unmatched rstat on 609, inode has n(v62 rc2015-03-16 08:55:06.598286), dirfrags have n(v0 rc2015-03-16 08:55:06.598286 1=0+1)
2015-04-08 15:58:34.056789 mds.0 [ERR] unmatched rstat on 600, inode has n(v101 rc2015-03-16 08:55:16.153175), dirfrags have n(v0 rc2015-03-16 08:55:16.153175 1=0+1)

These errors are likely caused by the bug that rstats are not set to correct values when creating a new fs. Nothing to worry about; the MDS automatically fixes rstat errors.

I am not sure if this is an issue, or got fixed, or something I should worry about. But I would just like some context around it, since it came up in ceph -w and other users might see it as well. I have done a lot of unsafe stuff on this mds, so not to freak anyone out if that is the issue.
Re: [ceph-users] low power single disk nodes
Minnowboard Max? 2 Atom cores, 1 SATA port, and a real (non-USB) Ethernet port.

On Thu, Apr 9, 2015, 8:03 AM p...@philw.com wrote:

Rather expensive option: Applied Micro X-Gene. Overkill for a single disk, and only really available in a development-kit format right now. https://www.apm.com/products/data-center/x-gene-family/x-c1-development-kits/

Better option: Ambedded CY7 - 7 nodes in 1U half depth, 6 positions for SATA disks, and one node with an mSATA SSD. http://www.ambedded.com.tw/pt_list.php?CM_ID=20140214001

--phil [snip - remainder of quoted thread appears in full above]
Re: [ceph-users] protocol feature mismatch after upgrading to Hammer
Can you dump your crush map and post it on pastebin or something?

On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson kylehut...@ksu.edu wrote:

Nope - it's 64-bit. (Sorry, I missed the reply-all last time.)

On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum g...@gregs42.com wrote:

[Re-added the list] Hmm, I'm checking the code and that shouldn't be possible. What's your client? (In particular, is it 32-bit? That's the only thing I can think of that might have slipped through our QA.)

On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson kylehut...@ksu.edu wrote:

I did nothing to enable anything else. Just changed my ceph repo from 'giant' to 'hammer', then did 'yum update' and restarted services.

On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum g...@gregs42.com wrote:

Did you enable the straw2 stuff? CRUSH_V4 shouldn't be required by the cluster unless you made changes to the layout requiring it. If you did, the clients have to be upgraded to understand it. You could disable all the v4 features; that should let them connect again.
-Greg

On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson kylehut...@ksu.edu wrote:

This particular problem I just figured out myself ('ceph -w' was still running from before the upgrade, and ctrl-c and restarting solved that issue), but I'm still having a similar problem on the ceph client:

libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca server's 102b84a042aca, missing 1

It appears that even the latest kernel doesn't have support for CEPH_FEATURE_CRUSH_V4. How do I make my ceph cluster backward-compatible with the old cephfs client? [snip - original post quoted in full above]
Re: [ceph-users] Motherboard recommendation?
Hi Markus,

I think if you connect more than 16 drives on the backplane, the X10DRH-CT will detect and show only 16 drives in the BIOS. I am not sure about that. If you test this motherboard, please let me know the result.

Message from the Supermicro site: LSI 3108 SAS3 (12Gbps) controller; 2GB cache; HW RAID 0, 1, 5, 6, 10, 50, 60; supports up to 16 devices as default, more HDD device support is also available as an option. * For special SKU, please contact your Supermicro sales.

Thanks
K. Mohamed Pakkeer

On Thu, Apr 9, 2015 at 5:05 PM, Markus Goldberg goldb...@uni-hildesheim.de wrote: [snip - earlier messages quoted in full above]
Re: [ceph-users] Recovering incomplete PGs with ceph_objectstore_tool
Congrats Chris, and nice save on that RBD!
-- Paul

On Apr 9, 2015, at 11:11 AM, Chris Kitzmiller ckitzmil...@hampshire.edu wrote:

Success! Hopefully my notes from the process will help:

In the event of multiple disk failures the cluster could lose PGs. Should this occur, it is best to attempt to restart the OSD process and have the drive marked as up+out. Marking the drive as out will cause data to flow off the drive to elsewhere in the cluster. In the event that the ceph-osd process is unable to keep running, you could try using the ceph_objectstore_tool program to extract just the damaged PGs and import them into working PGs.

Fixing Journals

In this particular scenario things were complicated by the fact that ceph_objectstore_tool came out in Giant but we were running Firefly. Not wanting to upgrade the cluster in a degraded state, this required that the OSD drives be moved to a different physical machine for repair. This added a lot of steps related to the journals, but it wasn't a big deal. That process looks like this, on Storage1:

    stop ceph-osd id=15
    ceph-osd -i 15 --flush-journal
    ls -l /var/lib/ceph/osd/ceph-15/journal

Note the journal device UUID, then pull the disk and move it to Ithome:

    rm /var/lib/ceph/osd/ceph-15/journal
    ceph-osd -i 15 --mkjournal

That creates a colocated journal to use during the ceph_objectstore_tool commands. Once done, then:

    ceph-osd -i 15 --flush-journal
    rm /var/lib/ceph/osd/ceph-15/journal

Pull the disk and bring it back to Storage1. Then:

    ln -s /dev/disk/by-partuuid/b4f8d911-5ac9-4bf0-a06a-b8492e25a00f /var/lib/ceph/osd/ceph-15/journal
    ceph-osd -i 15 --mkjournal
    start ceph-osd id=15

This all won't be needed once the cluster is running Hammer, because then there will be an available version of ceph_objectstore_tool on the local machine and you can keep the journals throughout the process.

Recovery Process

We were missing two PGs, 3.c7 and 3.102. These PGs were hosted on OSD.0 and OSD.15, the two disks which failed out of Storage1. The disk for OSD.0 seemed to be a total loss, while the disk for OSD.15 was somewhat more cooperative, but not in a state to be up and running in the cluster. I took the dying OSD.15 drive and placed it into a new physical machine with a fresh install of Ceph Giant. Using Giant's ceph_objectstore_tool I was able to extract the PGs with a command like:

    for i in 3.c7 3.102 ; do ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-15 --journal /var/lib/ceph/osd/ceph-15/journal --op export --pgid $i --file ~/${i}.export ; done

Once both PGs were successfully exported, I attempted to import them into a new temporary OSD, following instructions from here. For some reason that didn't work: the OSD was up+in but wasn't backfilling the PGs into the cluster. If you find yourself in this process I would try that first, just in case it provides a cleaner process.
Considering the above didn't work, and we were looking at the possibility of losing the RBD volume (or perhaps worse, the potential of fruitlessly fscking 35TB), I took what I might describe as heroic measures.

Running ceph pg dump | grep incomplete showed:

    3.c7  0 0 0 0 0 0 0 incomplete 2015-04-02 20:49:32.968841 0'0 15730:17 [15,0] 15 [15,0] 15 13985'54076 2015-03-31 19:14:22.721695 13985'54076 2015-03-31 19:14:22.721695
    3.102 0 0 0 0 0 0 0 incomplete 2015-04-02 20:49:32.529594 0'0 15730:21 [0,15] 0 [0,15] 0 13985'53107 2015-03-29 21:17:15.568125 13985'49195 2015-03-24 18:38:08.244769

Then I stopped all OSDs, which blocked all I/O to the cluster, with:

    stop ceph-osd-all

Then I looked for all copies of the PGs on all OSDs with:

    for i in 3.c7 3.102 ; do find /var/lib/ceph/osd/ -maxdepth 3 -type d -name $i ; done | sort -V
    /var/lib/ceph/osd/ceph-0/current/3.c7_head
    /var/lib/ceph/osd/ceph-0/current/3.102_head
    /var/lib/ceph/osd/ceph-3/current/3.c7_head
    /var/lib/ceph/osd/ceph-13/current/3.102_head
    /var/lib/ceph/osd/ceph-15/current/3.c7_head
    /var/lib/ceph/osd/ceph-15/current/3.102_head

Then I flushed the journals for all of those OSDs with:

    for i in 0 3 13 15 ; do ceph-osd -i $i --flush-journal ; done

Then I removed all of those drives and moved them (using Journal Fixing above) to Ithome, where I used ceph_objectstore_tool to remove all traces of 3.102 and 3.c7:

    for i in 0 3 13 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op remove --pgid $j ; done ; done

Then I imported the PGs onto OSD.0 and OSD.15 with:

    for i in 0 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op import --file ~/${j}.export ; done ; done
    for i in 0 15 ; do ceph-osd -i $i --flush-journal ; rm /var/lib/ceph/osd/ceph-$i/journal ; done

Then I moved the disks back
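One step the notes gloss over is verifying which OSD actually holds the most complete copy of a PG before exporting; a hedged sketch of how one might check (pg query works while the monitors are up; the --op info call assumes the Giant-era ceph_objectstore_tool supports it, and the OSD must be stopped first):

    ceph pg 3.c7 query | less      # peering info, e.g. the probing_osds list
    ceph pg map 3.c7               # current up/acting sets
    # on each candidate disk, compare PG metadata such as last_update:
    ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-15 \
        --journal /var/lib/ceph/osd/ceph-15/journal \
        --op info --pgid 3.c7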
Re: [ceph-users] cache-tier do not evict
Hi,

ceph version 0.87.1

thanks
best regards

-----Original message-----
From: Chu Duc Minh chu.ducm...@gmail.com
Sent: Thursday 9th April 2015 15:03
To: Patrik Plank pat...@plank.me
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] cache-tier do not evict

What ceph version do you use?

Regards,

On 9 Apr 2015 18:58, Patrik Plank pat...@plank.me wrote: [snip - original message quoted in full below]
Re: [ceph-users] OSDs not coming up on one host
On Wed, Apr 08, 2015 at 03:42:29PM +0000, Gregory Farnum wrote:

I'm on my phone so I can't check exactly what those threads are trying to do, but the osd has several threads which are stuck. The FileStore threads are certainly trying to access the disk/local filesystem. You may not have a hardware fault, but it looks like something in your stack is not behaving when the osd asks the filesystem to do something. Check dmesg, etc.
-Greg

Noticed a bit in dmesg that seems to be controller-related (HP Smart Array P420i) where I/O was hanging in some cases [1]; fixed by updating the firmware from 5.42 to 6.00.

[1] http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c03555882

In dmesg:

[11775.779477] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 Tag:0x:0010 Command:0x2a SN:0x49fb REQUEST SUCCEEDED.
[11812.170350] hpsa :08:00.0: Abort request on C1:B0:T0:L0
[11817.386773] hpsa :08:00.0: cp 880522bff000 is reported invalid (probably means target device no longer present)
[11817.386784] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 Tag:0x:0010 Command:0x2a SN:0x4a13 REQUEST SUCCEEDED.

The problem still appears to be persisting in the cluster. Although I am no longer seeing the disk-related errors in dmesg, I am still getting errors in the osd logs:

2015-04-08 17:24:15.024820 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 17:24:15.025043 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 17:48:33.146399 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 17:48:33.146439 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 18:55:31.107727 7f0f16740700 1 heartbeat_map reset_timeout 'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
2015-04-08 18:55:31.107774 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 18:55:31.107789 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 18:55:31.108225 7f0f29eaf700 1 heartbeat_map is_healthy 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 18:55:31.108268 7f0f15f3f700 1 heartbeat_map reset_timeout 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
2015-04-08 18:55:31.108272 7f0f29eaf700 1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
2015-04-08 18:55:31.108281 7f0f29eaf700 1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
2015-04-08 18:55:31.108285 7f0f1573e700 1 heartbeat_map reset_timeout 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 18:55:31.108345 7f0f16f41700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
2015-04-08 18:55:31.108378 7f0f17742700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
2015-04-08 19:01:20.694897 7f0f15f3f700 1 heartbeat_map reset_timeout 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
2015-04-08 19:01:20.694928 7f0f17742700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
2015-04-08 19:01:20.694970 7f0f16f41700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
2015-04-08 19:01:20.695544 7f0f1573e700 1 heartbeat_map reset_timeout 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 19:01:20.695665 7f0f16740700 1 heartbeat_map reset_timeout 'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
2015-04-08 19:01:34.979288 7f0f1573e700 1 heartbeat_map reset_timeout 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 19:01:34.979498 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 19:01:34.979513 7f0f16f41700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
2015-04-08 19:01:34.979535 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 19:01:34.980021 7f0f15f3f700 1 heartbeat_map reset_timeout 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
2015-04-08 19:01:34.980051 7f0f17742700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
2015-04-08 19:01:34.980392 7f0f16740700 1 heartbeat_map reset_timeout 'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
2015-04-08 19:03:34.731872 7f0f1573e700 1 heartbeat_map reset_timeout 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 19:03:34.731972 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700'
Re: [ceph-users] long blocking with writes on rbds
On Thu, 09 Apr 2015 00:25:08 -0400 Jeff Epstein wrote:

Running Ceph on AWS is, as was mentioned before, certainly not going to improve things when compared to real HW. At the very least it will make performance unpredictable. Your 6 OSDs are on a single VM, from what I gather? Aside from being a very small number for something that you seem to be using in some sort of production environment (Ceph gets faster the more OSDs you add), where is the redundancy and HA in that?

The number of your PGs and PGPs needs to have at least a semblance of being correctly sized, as others mentioned before. You want to re-read the Ceph docs about that and check out the PG calculator: http://ceph.com/pgcalc/

Our workload involves creating and destroying a lot of pools. Each pool has 100 pgs, so it adds up. Could this be causing the problem? What would you suggest instead?

...this is most likely the cause. Deleting a pool causes the data and pgs associated with it to be deleted asynchronously, which can be a lot of background work for the osds. If you're using the cfq scheduler you can try decreasing the priority of these operations with the osd disk thread ioprio... options: http://ceph.com/docs/master/rados/configuration/osd-config-ref/#operations

If that doesn't help enough, deleting data from pools before deleting the pools might help, since you can control the rate more finely. And of course not creating/deleting so many pools would eliminate the hidden background cost of deleting the pools.

Thanks for your answer. Some follow-up questions:

- I wouldn't expect that pool deletion is the problem, since our pools, although many, don't contain much data. Typically we will have one rbd per pool, several GB in size, but in practice containing little data. Would you expect the performance penalty from deleting a pool to be relative to the requested size of the rbd, or relative to the quantity of data actually stored in it?

Since RBDs are sparsely allocated, the actual data used is the key factor. But you're adding the pool-removal overhead to this.

- Rather than creating and deleting multiple pools, each containing a single rbd, do you think we would see a speed-up if we were to instead have one pool containing multiple (frequently created and deleted) rbds? Does the performance penalty stem only from deleting pools themselves, or from deleting objects within the pool as well?

Both, and the fact that you have overloaded the PGs by nearly a factor of 10 (or 20 if you're actually using a replica of 3 and not 1) doesn't help one bit. And let's clarify what objects are in the Ceph/RBD context: they're the (by default) 4MB blobs that make up an RBD image.

- Somewhat off-topic, but for my own curiosity: why is deleting data so slow, in terms of Ceph's architecture? Shouldn't it just be a matter of flagging a region as available and allowing it to be overwritten, as a traditional file system would?

Apples and oranges, as RBD is block storage, not a FS. That said, a traditional FS is local and updates an inode or equivalent bit. For Ceph to delete an RBD image, it has to go to all cluster nodes with OSDs that have PGs that contain objects of that image. Then those objects have to be deleted on the local filesystem of the OSD, and various maps updated cluster-wide. Rinse and repeat until all objects have been dealt with. Quite a bit more involved, but that's the price you have to pay when you have a DISTRIBUTED storage architecture that doesn't rely on a single item (like an inode) to reflect things for the whole system.
Christian

Jeff

--
Christian Balzer    Network/Systems Engineer
ch...@gol.com    Global OnLine Japan/Fusion Communications
http://www.gol.com/
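For reference, the "osd disk thread ioprio" options mentioned above would look roughly like this in ceph.conf - a sketch, not a tested recommendation; note they only take effect when the OSD disks use the CFQ I/O scheduler:

    [osd]
    ; push the disk thread's background work down to idle priority
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 7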
Re: [ceph-users] long blocking with writes on rbds
On 04/09/2015 03:14 AM, Christian Balzer wrote:

Your 6 OSDs are on a single VM, from what I gather? Aside from being a very small number for something that you seem to be using in some sort of production environment (Ceph gets faster the more OSDs you add), where is the redundancy and HA in that?

We are running one OSD per VM. All data is replicated across three VMs.

The number of your PGs and PGPs needs to have at least a semblance of being correctly sized, as others mentioned before. You want to re-read the Ceph docs about that and check out the PG calculator: http://ceph.com/pgcalc/

My choice of pgs is based on this page. Since each pool is spread across 3 OSDs, 100 seemed like a good number. Am I misinterpreting this documentation? http://ceph.com/docs/master/rados/operations/placement-groups/

Since RBDs are sparsely allocated, the actual data used is the key factor. But you're adding the pool-removal overhead to this.

How much overhead does pool removal add?

Both, and the fact that you have overloaded the PGs by nearly a factor of 10 (or 20 if you're actually using a replica of 3 and not 1) doesn't help one bit.

I'm curious how you reached your estimation of overloading. According to the PG calculator you linked to, given that each pool occupies only 3 OSDs, the suggested number of pgs is around 100. Can you explain?

Apples and oranges, as RBD is block storage, not a FS. [snip - full explanation quoted in the previous message]

Thank you for explaining.

Jeff
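For reference, a sketch of the rule-of-thumb arithmetic behind the "factor of 10 (or 20)" estimate above, assuming the (OSDs x 100) / replicas guideline from the PG calculator - which applies to the cluster as a whole, not per pool:

    guideline: (6 OSDs * 100) / 3 replicas = 200 PGs total, across ALL pools
    actual:    44 pools * 100 PGs = 4400 PGs, i.e. roughly 20x the guideline
               (with replica 1 the guideline is 600 PGs, still ~7x over)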
Re: [ceph-users] Ceph Hammer : Ceph-deploy 1.5.23-0 : RGW civetweb :: Not getting installed
Hi Vickey,

The keyring gets created as part of the initial deployment, so it should be on your admin node right alongside the admin keyring etc. FWIW, I tried this quickly yesterday and it failed because the RGW directory didn't exist on the node that I was attempting to deploy to ... but I didn't actually look that deeply into it, as it's not critical for what I wanted to complete today. The keyring was definitely there following a successful deployment, though.

Kind regards
Iain

On Thu, Apr 9, 2015 at 7:41 PM, Vickey Singh vickey.singh22...@gmail.com wrote:

Hello Cephers,

I am trying to set up RGW using ceph-deploy, as described here: http://docs.ceph.com/docs/master/start/quick-ceph-deploy/#add-an-rgw-instance

But unfortunately it doesn't seem to be working. Is there something I am missing, or do you know a fix for this?

[root@ceph-node1 yum.repos.d]# ceph -v
ceph version 0.94 (e61c4f093f88e44961d157f65091733580cea79a)

[root@ceph-node1 yum.repos.d]# yum update ceph-deploy
SKIPPED
Verifying : ceph-deploy-1.5.22-0.noarch 2/2
Updated: ceph-deploy.noarch 0:1.5.23-0
Complete!

[root@ceph-node1 ceph]# ceph-deploy rgw create rgw-node1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy rgw create rgw-node1
[ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts rgw-node1:rgw.rgw-node1
[ceph_deploy][ERROR ] RuntimeError: bootstrap-rgw keyring not found; run 'gatherkeys'

[root@ceph-node1 ceph]# ceph-deploy --overwrite-conf mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy --overwrite-conf mon create-initial
SKIPPED
[ceph_deploy.mon][INFO ] mon.ceph-node1 monitor has reached quorum!
[ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO ] Running gatherkeys...
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for /var/lib/ceph/bootstrap-rgw/ceph.keyring
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph-node1][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1
[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be able to deploy RGW daemons

[root@ceph-node1 ceph]# ceph-deploy gatherkeys ceph-node1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy gatherkeys ceph-node1
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for /var/lib/ceph/bootstrap-rgw/ceph.keyring
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph-node1][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1
[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be able to deploy RGW daemons

Regards
VS

--
Iain Geddes, Application Engineer, Cyan
1383 North McDowell Blvd, Petaluma, CA 94954
M +353 89 432 6811, iain.ged...@cyaninc.com, www.cyaninc.com
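If the cluster was originally deployed before Hammer, the bootstrap-rgw key may simply never have been created; one workaround (a sketch, assuming Hammer's bootstrap-rgw auth profile and the default paths) is to create it by hand on a monitor node and then re-run gatherkeys:

    mkdir -p /var/lib/ceph/bootstrap-rgw
    ceph auth get-or-create client.bootstrap-rgw mon 'allow profile bootstrap-rgw' \
        -o /var/lib/ceph/bootstrap-rgw/ceph.keyring
    ceph-deploy gatherkeys ceph-node1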
Re: [ceph-users] Cascading Failure of OSDs
Francois Lafont wrote:

Just in case it could be useful, I have noticed the -s option (on my Ubuntu) that offers an output probably easier to parse:

    # column -t is just to make it nice for human eyes
    ifconfig -s | column -t

Since ifconfig is deprecated, one should use iproute2 instead:

    ip -s link show p2p1 | awk '/(RX|TX):/{getline; print $3;}'

However, the sysfs interface is probably a better alternative. See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-class-net-statistics and https://www.kernel.org/doc/Documentation/ABI/README.

--
Carl-Johan Schenström, Driftansvarig / System Administrator
Språkbanken & Svensk nationell datatjänst / The Swedish Language Bank & Swedish National Data Service
Göteborgs universitet / University of Gothenburg
carl-johan.schenst...@gu.se / +46 709 116769
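For example, the same counters can be read straight from sysfs - interface name p2p1 taken from the post above; every file in the statistics directory holds a single integer:

    cat /sys/class/net/p2p1/statistics/rx_bytes
    cat /sys/class/net/p2p1/statistics/tx_errors
    # or dump all counters at once, one 'name:value' per line:
    grep . /sys/class/net/p2p1/statistics/*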
[ceph-users] cache-tier do not evict
Hi,

I have built a cache-tier pool (replica 2) with 3 x 512GB SSDs for my kvm pool. These are my settings:

    ceph osd tier add kvm cache-pool
    ceph osd tier cache-mode cache-pool writeback
    ceph osd tier set-overlay kvm cache-pool
    ceph osd pool set cache-pool hit_set_type bloom
    ceph osd pool set cache-pool hit_set_count 1
    ceph osd pool set cache-pool hit set period 3600
    ceph osd pool set cache-pool target_max_bytes 751619276800
    ceph osd pool set cache-pool target_max_objects 100
    ceph osd pool set cache-pool cache_min_flush_age 1800
    ceph osd pool set cache-pool cache_min_evict_age 600
    ceph osd pool cache-pool cache_target_dirty_ratio .4
    ceph osd pool cache-pool cache target_full_ratio .8

So the problem is, the cache tier does not evict automatically. If I copy some kvm images to the ceph cluster, the cache OSDs always run full. Is that normal? Is there a misconfiguration?

thanks
best regards
Patrik
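As a side note, a few of the commands above would be rejected by the CLI as transcribed ('hit set period' instead of hit_set_period, and two lines missing the 'set' verb); assuming the standard 'ceph osd pool set' syntax was intended, the corrected forms would be:

    ceph osd pool set cache-pool hit_set_period 3600
    ceph osd pool set cache-pool cache_target_dirty_ratio .4
    ceph osd pool set cache-pool cache_target_full_ratio .8

If those three never actually took effect, the missing dirty/full ratios alone could explain a cache pool that fills up without flushing or evicting.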
Re: [ceph-users] protocol feature mismatch after upgrading to Hammer
[Re-added the list]

Hmm, I'm checking the code and that shouldn't be possible. What's your client? (In particular, is it 32-bit? That's the only thing I can think of that might have slipped through our QA.)

On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson kylehut...@ksu.edu wrote:

I did nothing to enable anything else. Just changed my ceph repo from 'giant' to 'hammer', then did 'yum update' and restarted services.

On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum g...@gregs42.com wrote:

Did you enable the straw2 stuff? CRUSH_V4 shouldn't be required by the cluster unless you made changes to the layout requiring it. If you did, the clients have to be upgraded to understand it. You could disable all the v4 features; that should let them connect again.
-Greg [snip - earlier messages quoted in full above]
Re: [ceph-users] cache-tier do not evict
Hi,

I set the cache-tier size to 644245094400. This should work. But it is the same.

thanks
regards

-----Original message-----
From: Gregory Farnum g...@gregs42.com
Sent: Thursday 9th April 2015 15:44
To: Patrik Plank pat...@plank.me
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] cache-tier do not evict

On Thu, Apr 9, 2015 at 4:56 AM, Patrik Plank pat...@plank.me wrote:

[snip - original message quoted in full above]

ceph osd pool set cache-pool target_max_bytes 751619276800

^ 750 GB. For 3*512GB disks that's too large a target value.

[snip]
Re: [ceph-users] protocol feature mismatch after upgrading to Hammer
http://people.beocat.cis.ksu.edu/~kylehutson/crushmap On Thu, Apr 9, 2015 at 11:25 AM, Gregory Farnum g...@gregs42.com wrote: Hmmm. That does look right and neither I nor Sage can come up with anything via code inspection. Can you post the actual binary crush map somewhere for download so that we can inspect it with our tools? -Greg On Thu, Apr 9, 2015 at 7:57 AM, Kyle Hutson kylehut...@ksu.edu wrote: Here 'tis: https://dpaste.de/POr1 On Thu, Apr 9, 2015 at 9:49 AM, Gregory Farnum g...@gregs42.com wrote: Can you dump your crush map and post it on pastebin or something? On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson kylehut...@ksu.edu wrote: Nope - it's 64-bit. (Sorry, I missed the reply-all last time.) On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum g...@gregs42.com wrote: [Re-added the list] Hmm, I'm checking the code and that shouldn't be possible. What's your client? (In particular, is it 32-bit? That's the only thing I can think of that might have slipped through our QA.) On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson kylehut...@ksu.edu wrote: I did nothing to enable anything else. Just changed my ceph repo from 'giant' to 'hammer', then did 'yum update' and restarted services. On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum g...@gregs42.com wrote: Did you enable the straw2 stuff? CRUSH_V4 shouldn't be required by the cluster unless you made changes to the layout requiring it. If you did, the clients have to be upgraded to understand it. You could disable all the v4 features; that should let them connect again. -Greg On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson kylehut...@ksu.edu wrote: This particular problem I just figured out myself ('ceph -w' was still running from before the upgrade, and ctrl-c and restarting solved that issue), but I'm still having a similar problem on the ceph client: libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca server's 102b84a042aca, missing 1 It appears that even the latest kernel doesn't have support for CEPH_FEATURE_CRUSH_V4. How do I make my ceph cluster backward-compatible with the old cephfs client? On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson kylehut...@ksu.edu wrote: I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly repeating this message: 2015-04-09 08:50:26.318042 7f95dbf86700 0 -- 10.5.38.1:0/2037478 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1 c=0x7f95e0023670).connect protocol feature mismatch, my 3fff peer 13fff missing 1 It isn't always the same IP for the destination - here's another: 2015-04-09 08:50:20.322059 7f95dc087700 0 -- 10.5.38.1:0/2037478 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1 c=0x7f95e002b480).connect protocol feature mismatch, my 3fff peer 13fff missing 1 Some details about our install: We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts are mds. We are running cephfs with a client trying to write data over cephfs when we're seeing these messages. Any ideas? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OSDs not coming up on one host
On Thu, Apr 09, 2015 at 08:46:07AM -0700, Gregory Farnum wrote: On Thu, Apr 9, 2015 at 8:14 AM, Jacob Reid lists-c...@jacob-reid.co.uk wrote: On Thu, Apr 09, 2015 at 06:43:45AM -0700, Gregory Farnum wrote: You can turn up debugging (debug osd = 10 and debug filestore = 10 are probably enough, or maybe 20 each) and see what comes out to get more information about why the threads are stuck. But just from the log my answer is the same as before, and now I don't trust that controller (or maybe its disks), regardless of what it's admitting to. ;) -Greg Ran with osd and filestore debug both at 20; still nothing jumping out at me. Logfile attached as it got huge fairly quickly, but mostly seems to be the same extra lines. I tried running some test I/O on the drives in question to try and provoke some kind of problem, but they seem fine now... Okay, this is strange. Something very wonky is happening with your scheduler — it looks like these threads are all idle, and they're scheduling wakeups that happen an appreciable amount of time after they're supposed to. For instance: 2015-04-09 15:56:55.953116 7f70a7963700 20 filestore(/var/lib/ceph/osd/osd.15) sync_entry woke after 5.416704 2015-04-09 15:56:55.953153 7f70a7963700 20 filestore(/var/lib/ceph/osd/osd.15) sync_entry waiting for max_interval 5.00 This is the thread that syncs your backing store, and it always sets itself to get woken up at 5-second intervals — but here it took 5.4 seconds, and later on in your log it takes more than 6 seconds. It looks like all the threads which are getting timed out are also idle, but are taking so much longer to wake up than they're set for that they get a timeout warning. There might be some bugs in here where we're expecting wakeups to be more precise than they can be, but these sorts of misses are definitely not normal. Is this server overloaded on the CPU? Have you done something to make the scheduler or wakeups wonky? -Greg CPU load is minimal - the host does nothing but run OSDs and has 8 cores that are all sitting idle with a load average of 0.1. I haven't done anything to scheduling. That was with the debug logging on, if that could be the cause of any delays. A scheduler issue seems possible - I haven't done anything to it, but `time sleep 5` run a few times returns times spread randomly from 5.002 to 7.1(!) seconds, mostly in the 5.5-6.0 region, whereas it managed a fairly consistent 5.2 on the other servers in the cluster and 5.02 on my desktop. I have disabled the CPU power saving mode as the only thing I could think of that might be having an effect on this, and running the same test again gives more sane results... we'll see if this reflects in the OSD logs or not, I guess. If this is the cause, it's probably something that the next version might want to make a specific warning case of detecting. I will keep you updated as to their behaviour now... ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
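A hedged sketch of the checks described above, for anyone chasing the same symptom; the sysfs path is standard, but governor names depend on the cpufreq driver, and cpupower may need installing:

for i in 1 2 3; do time sleep 5; done    # real time should sit very close to 5.0s
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
cpupower frequency-set -g performance    # pin cores to full clock while testing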
Re: [ceph-users] Recovering incomplete PGs with ceph_objectstore_tool
Success! Hopefully my notes from the process will help: In the event of multiple disk failures the cluster could lose PGs. Should this occur it is best to attempt to restart the OSD process and have the drive marked as up+out. Marking the drive as out will cause data to flow off the drive to elsewhere in the cluster. In the event that the ceph-osd process is unable to keep running, you can try using the ceph_objectstore_tool program to extract just the damaged PGs and import them into a working OSD. Fixing Journals In this particular scenario things were complicated by the fact that ceph_objectstore_tool came out in Giant but we were running Firefly. Not wanting to upgrade the cluster in a degraded state, this required that the OSD drives be moved to a different physical machine for repair. This added a lot of steps related to the journals, but it wasn't a big deal. That process looks like: On Storage1: stop ceph-osd id=15 ceph-osd -i 15 --flush-journal ls -l /var/lib/ceph/osd/ceph-15/journal Note the journal device UUID, then pull the disk and move it to Ithome: rm /var/lib/ceph/osd/ceph-15/journal ceph-osd -i 15 --mkjournal That creates a colocated journal to use during the ceph_objectstore_tool commands. Once done then: ceph-osd -i 15 --flush-journal rm /var/lib/ceph/osd/ceph-15/journal Pull the disk and bring it back to Storage1. Then: ln -s /dev/disk/by-partuuid/b4f8d911-5ac9-4bf0-a06a-b8492e25a00f /var/lib/ceph/osd/ceph-15/journal ceph-osd -i 15 --mkjournal start ceph-osd id=15 This all won't be needed once the cluster is running Hammer, because then there will be an available version of ceph_objectstore_tool on the local machine and you can keep the journals throughout the process. Recovery Process We were missing two PGs, 3.c7 and 3.102. These PGs were hosted on OSD.0 and OSD.15, which were the two disks which failed out of Storage1. The disk for OSD.0 seemed to be a total loss while the disk for OSD.15 was somewhat more cooperative, but not in a place to be up and running in the cluster. I took the dying OSD.15 drive and placed it into a new physical machine with a fresh install of Ceph Giant. Using Giant's ceph_objectstore_tool I was able to extract the PGs with a command like: for i in 3.c7 3.102 ; do ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-15 --journal /var/lib/ceph/osd/ceph-15/journal --op export --pgid $i --file ~/${i}.export ; done Once both PGs were successfully exported I attempted to import them into a new temporary OSD following instructions from here. For some reason that didn't work. The OSD was up+in but wasn't backfilling the PGs into the cluster. If you find yourself in this process I would try that first just in case it provides a cleaner process.
Considering the above didn't work and we were looking at the possibility of losing the RBD volume (or perhaps worse, the potential of fruitlessly fscking 35TB), I took what I might describe as heroic measures: Running ceph pg dump | grep incomplete 3.c7 0 0 0 0 0 0 0 incomplete 2015-04-02 20:49:32.968841 0'0 15730:17 [15,0] 15 [15,0] 15 13985'54076 2015-03-31 19:14:22.721695 13985'54076 2015-03-31 19:14:22.721695 3.102 0 0 0 0 0 0 0 incomplete 2015-04-02 20:49:32.529594 0'0 15730:21 [0,15] 0 [0,15] 0 13985'53107 2015-03-29 21:17:15.568125 13985'49195 2015-03-24 18:38:08.244769 Then I stopped all OSDs, which blocked all I/O to the cluster, with: stop ceph-osd-all Then I looked for all copies of the PG on all OSDs with: for i in 3.c7 3.102 ; do find /var/lib/ceph/osd/ -maxdepth 3 -type d -name $i ; done | sort -V /var/lib/ceph/osd/ceph-0/current/3.c7_head /var/lib/ceph/osd/ceph-0/current/3.102_head /var/lib/ceph/osd/ceph-3/current/3.c7_head /var/lib/ceph/osd/ceph-13/current/3.102_head /var/lib/ceph/osd/ceph-15/current/3.c7_head /var/lib/ceph/osd/ceph-15/current/3.102_head Then I flushed the journals for all of those OSDs with: for i in 0 3 13 15 ; do ceph-osd -i $i --flush-journal ; done Then I removed all of those drives and moved them (using Journal Fixing above) to Ithome where I used ceph_objectstore_tool to remove all traces of 3.102 and 3.c7: for i in 0 3 13 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op remove --pgid $j ; done ; done Then I imported the PGs onto OSD.0 and OSD.15 with: for i in 0 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op import --file ~/${j}.export ; done ; done for i in 0 15 ; do ceph-osd -i $i --flush-journal ; rm /var/lib/ceph/osd/ceph-$i/journal ; done Then I moved the disks back to Storage1 and started them all back up again. I think that this should have worked but what happened in this case was that OSD.0 didn't start up for some reason. I initially thought that that wouldn't matter because OSD.15 did start and
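Between the export and the remove/import steps above, it is worth confirming the exports are sane before destroying anything. A minimal sketch, assuming your ceph_objectstore_tool build supports --op list (op names vary a little between releases):

# confirm the PG's objects are readable on the source OSD
ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-15 \
  --journal /var/lib/ceph/osd/ceph-15/journal --op list --pgid 3.c7
# and that the export files are non-empty
ls -lh ~/3.c7.export ~/3.102.export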
Re: [ceph-users] low power single disk nodes
I'm skeptical about how well this would work, but a Banana Pi might be a place to start. Like a Raspberry Pi, but it has a SATA connector: http://www.bananapi.org/ On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg jer...@update.uu.se wrote: Hello ceph users, Is anyone running any low-powered single disk nodes with Ceph now? Calxeda seems to be no more according to Wikipedia. I do not think HP Moonshot is what I am looking for - I want stand-alone nodes, not server cartridges integrated into server chassis. And I do not want to be locked to a single vendor. I was playing with Raspberry Pi 2 for signage when I thought of my old experiments with Ceph. I am thinking of for example Odroid-C1 or Odroid-XU3 Lite, or maybe something with a low-power Intel x64/x86 processor. Together with one SSD or one low power HDD the node could get all power via PoE (via a splitter, or integrated into the board if such boards exist). PoE provides remote power-on/power-off even for consumer grade nodes. The cost for a single low power node should be able to compete with a traditional PC-server's price per disk. Ceph takes care of redundancy. I think simple custom casing should be good enough - maybe just strap or velcro everything on trays in the rack, at least for the nodes with SSD. Kind regards, -- Jerker Nyberg, Uppsala, Sweden. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] protocol feature mismatch after upgrading to Hammer
Here 'tis: https://dpaste.de/POr1 On Thu, Apr 9, 2015 at 9:49 AM, Gregory Farnum g...@gregs42.com wrote: Can you dump your crush map and post it on pastebin or something? On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson kylehut...@ksu.edu wrote: Nope - it's 64-bit. (Sorry, I missed the reply-all last time.) On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum g...@gregs42.com wrote: [Re-added the list] Hmm, I'm checking the code and that shouldn't be possible. What's your client? (In particular, is it 32-bit? That's the only thing I can think of that might have slipped through our QA.) On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson kylehut...@ksu.edu wrote: I did nothing to enable anything else. Just changed my ceph repo from 'giant' to 'hammer', then did 'yum update' and restarted services. On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum g...@gregs42.com wrote: Did you enable the straw2 stuff? CRUSH_V4 shouldn't be required by the cluster unless you made changes to the layout requiring it. If you did, the clients have to be upgraded to understand it. You could disable all the v4 features; that should let them connect again. -Greg On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson kylehut...@ksu.edu wrote: This particular problem I just figured out myself ('ceph -w' was still running from before the upgrade, and ctrl-c and restarting solved that issue), but I'm still having a similar problem on the ceph client: libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca server's 102b84a042aca, missing 1 It appears that even the latest kernel doesn't have support for CEPH_FEATURE_CRUSH_V4. How do I make my ceph cluster backward-compatible with the old cephfs client? On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson kylehut...@ksu.edu wrote: I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly repeating this message: 2015-04-09 08:50:26.318042 7f95dbf86700 0 -- 10.5.38.1:0/2037478 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1 c=0x7f95e0023670).connect protocol feature mismatch, my 3fff peer 13fff missing 1 It isn't always the same IP for the destination - here's another: 2015-04-09 08:50:20.322059 7f95dc087700 0 -- 10.5.38.1:0/2037478 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1 c=0x7f95e002b480).connect protocol feature mismatch, my 3fff peer 13fff missing 1 Some details about our install: We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts are mds. We are running cephfs with a client trying to write data over cephfs when we're seeing these messages. Any ideas? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Rebuild bucket index
Hello ceph users, Do you know a way to rebuild a bucket index? I would like to change num_shards for an existing bucket. If I change this value in the bucket meta, the new index objects are indeed created, but empty (bucket listing returns null). It would be nice to be able to recreate the index from the objects. Does anyone have an idea for doing this? Thanks. Laurent Barbe ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
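No answer appears in this digest, but for reference a hedged sketch of the closest existing tooling; "mybucket" is a placeholder. On these releases radosgw-admin can rebuild index entries from the objects it can find, though changing num_shards on an existing bucket is not supported:

radosgw-admin bucket check --bucket=mybucket                        # report inconsistencies
radosgw-admin bucket check --bucket=mybucket --fix --check-objects  # rebuild index entries from objects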
Re: [ceph-users] protocol feature mismatch after upgrading to Hammer
Nope - it's 64-bit. (Sorry, I missed the reply-all last time.) On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum g...@gregs42.com wrote: [Re-added the list] Hmm, I'm checking the code and that shouldn't be possible. What's your client? (In particular, is it 32-bit? That's the only thing I can think of that might have slipped through our QA.) On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson kylehut...@ksu.edu wrote: I did nothing to enable anything else. Just changed my ceph repo from 'giant' to 'hammer', then did 'yum update' and restarted services. On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum g...@gregs42.com wrote: Did you enable the straw2 stuff? CRUSH_V4 shouldn't be required by the cluster unless you made changes to the layout requiring it. If you did, the clients have to be upgraded to understand it. You could disable all the v4 features; that should let them connect again. -Greg On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson kylehut...@ksu.edu wrote: This particular problem I just figured out myself ('ceph -w' was still running from before the upgrade, and ctrl-c and restarting solved that issue), but I'm still having a similar problem on the ceph client: libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca server's 102b84a042aca, missing 1 It appears that even the latest kernel doesn't have support for CEPH_FEATURE_CRUSH_V4. How do I make my ceph cluster backward-compatible with the old cephfs client? On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson kylehut...@ksu.edu wrote: I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly repeating this message: 2015-04-09 08:50:26.318042 7f95dbf86700 0 -- 10.5.38.1:0/2037478 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1 c=0x7f95e0023670).connect protocol feature mismatch, my 3fff peer 13fff missing 1 It isn't always the same IP for the destination - here's another: 2015-04-09 08:50:20.322059 7f95dc087700 0 -- 10.5.38.1:0/2037478 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1 c=0x7f95e002b480).connect protocol feature mismatch, my 3fff peer 13fff missing 1 Some details about our install: We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts are mds. We are running cephfs with a client trying to write data over cephfs when we're seeing these messages. Any ideas? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] installing and updating while leaving osd drive data intact
Referencing this old thread below, I am wondering what the proper way is to install, say, new versions of ceph and start up daemons while keeping all the data on the OSD drives. I had been using ceph-deploy new, which I guess creates a new cluster fsid. Normally for my testing I had been starting with clean OSD drives, but I would also like to be able to restart and leave the OSD drives as is. -- Tom Hi, I have faced a similar issue. This happens if the ceph disks aren't purged/cleaned completely. Clear the contents of the /dev/sdb1 device. There is a file named ceph_fsid on the disk which would have the old cluster's fsid. This needs to be deleted for it to work. Hope it helps. Sharmila On Mon, May 26, 2014 at 2:52 PM, JinHwan Hwang calanchue at gmail.com wrote: I'm trying to install ceph 0.80.1 on ubuntu 14.04. All other things go well except the 'activate osd' phase. It tells me it can't find the proper fsid when I do 'activate osd'. This is not my first time installing ceph, and all the steps I did were ok when I did them before (though that was on ubuntu 12.04, virtual machines, ceph-emperor) ceph at ceph-mon:~$ ceph-deploy osd activate ceph-osd0:/dev/sdb1 ceph-osd0:/dev/sdc1 ceph-osd1:/dev/sdb1 ceph-osd1:/dev/sdc1 ... [ceph-osd0][WARNIN] ceph-disk: Error: No cluster conf found in /etc/ceph with fsid 05b994a0-20f9-48d7-8d34-107ffcb39e5b .. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
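A hedged sketch of the fsid check implied above; the OSD path and fsid shown are examples. The key point is that the fsid stamped on the disk must match the cluster the daemons join:

cat /var/lib/ceph/osd/ceph-0/ceph_fsid   # fsid the OSD data was created under
ceph fsid                                # fsid of the running cluster
# to keep the old data, reuse the old fsid in ceph.conf before deploying,
# instead of letting 'ceph-deploy new' generate a fresh one:
#   [global]
#   fsid = 05b994a0-20f9-48d7-8d34-107ffcb39e5b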
Re: [ceph-users] OSDs not coming up on one host
On Thu, Apr 9, 2015 at 8:14 AM, Jacob Reid lists-c...@jacob-reid.co.uk wrote: On Thu, Apr 09, 2015 at 06:43:45AM -0700, Gregory Farnum wrote: You can turn up debugging (debug osd = 10 and debug filestore = 10 are probably enough, or maybe 20 each) and see what comes out to get more information about why the threads are stuck. But just from the log my answer is the same as before, and now I don't trust that controller (or maybe its disks), regardless of what it's admitting to. ;) -Greg Ran with osd and filestore debug both at 20; still nothing jumping out at me. Logfile attached as it got huge fairly quickly, but mostly seems to be the same extra lines. I tried running some test I/O on the drives in question to try and provoke some kind of problem, but they seem fine now... Okay, this is strange. Something very wonky is happening with your scheduler — it looks like these threads are all idle, and they're scheduling wakeups that happen an appreciable amount of time after they're supposed to. For instance: 2015-04-09 15:56:55.953116 7f70a7963700 20 filestore(/var/lib/ceph/osd/osd.15) sync_entry woke after 5.416704 2015-04-09 15:56:55.953153 7f70a7963700 20 filestore(/var/lib/ceph/osd/osd.15) sync_entry waiting for max_interval 5.00 This is the thread that syncs your backing store, and it always sets itself to get woken up at 5-second intervals — but here it took 5.4 seconds, and later on in your log it takes more than 6 seconds. It looks like all the threads which are getting timed out are also idle, but are taking so much longer to wake up than they're set for that they get a timeout warning. There might be some bugs in here where we're expecting wakeups to be more precise than they can be, but these sorts of misses are definitely not normal. Is this server overloaded on the CPU? Have you done something to make the scheduler or wakeups wonky? -Greg ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] low power single disk nodes
Where's the take my money button? On Thu, Apr 9, 2015 at 9:43 AM, Mark Nelson mnel...@redhat.com wrote: How about drives that run Linux with an ARM processor, RAM, and an ethernet port right on the drive? Notice the Ceph logo. :) https://www.hgst.com/science-of-storage/emerging-technologies/open-ethernet-drive-architecture Mark On 04/09/2015 10:37 AM, Scott Laird wrote: Minnowboard Max? 2 atom cores, 1 SATA port, and a real (non-USB) Ethernet port. On Thu, Apr 9, 2015, 8:03 AM p...@philw.com wrote: Rather expensive option: Applied Micro X-Gene, overkill for a single disk, and only really available in a development kit format right now. https://www.apm.com/products/data-center/x-gene-family/x-c1-development-kits/ Better Option: Ambedded CY7 - 7 nodes in 1U half depth, 6 positions for SATA disks, and one node with mSATA SSD http://www.ambedded.com.tw/pt_list.php?CM_ID=20140214001 --phil On 09 April 2015 at 15:57 Quentin Hartman qhart...@direwolfdigital.com wrote: I'm skeptical about how well this would work, but a Banana Pi might be a place to start. Like a Raspberry Pi, but it has a SATA connector: http://www.bananapi.org/ On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg jer...@update.uu.se wrote: Hello ceph users, Is anyone running any low-powered single disk nodes with Ceph now? Calxeda seems to be no more according to Wikipedia. I do not think HP Moonshot is what I am looking for - I want stand-alone nodes, not server cartridges integrated into server chassis. And I do not want to be locked to a single vendor. I was playing with Raspberry Pi 2 for signage when I thought of my old experiments with Ceph. I am thinking of for example Odroid-C1 or Odroid-XU3 Lite, or maybe something with a low-power Intel x64/x86 processor. Together with one SSD or one low power HDD the node could get all power via PoE (via a splitter, or integrated into the board if such boards exist). PoE provides remote power-on/power-off even for consumer grade nodes. The cost for a single low power node should be able to compete with a traditional PC-server's price per disk. Ceph takes care of redundancy. I think simple custom casing should be good enough - maybe just strap or velcro everything on trays in the rack, at least for the nodes with SSD. Kind regards, -- Jerker Nyberg, Uppsala, Sweden.
Re: [ceph-users] cache-tier do not evict
On Thu, Apr 9, 2015 at 4:56 AM, Patrik Plank pat...@plank.me wrote: Hi, I have built a cache-tier pool (replica 2) with 3 x 512gb ssd for my kvm pool. These are my settings: ceph osd tier add kvm cache-pool ceph osd tier cache-mode cache-pool writeback ceph osd tier set-overlay kvm cache-pool ceph osd pool set cache-pool hit_set_type bloom ceph osd pool set cache-pool hit_set_count 1 ceph osd pool set cache-pool hit_set_period 3600 ceph osd pool set cache-pool target_max_bytes 751619276800 ^ 750 GB. For 3*512GB disks that's too large a target value. ceph osd pool set cache-pool target_max_objects 100 ceph osd pool set cache-pool cache_min_flush_age 1800 ceph osd pool set cache-pool cache_min_evict_age 600 ceph osd pool set cache-pool cache_target_dirty_ratio .4 ceph osd pool set cache-pool cache_target_full_ratio .8 So the problem is, the cache-tier does not evict automatically. If I copy some kvm images to the ceph cluster, the cache osds always run full. Is that normal? Is there a misconfiguration? thanks best regards Patrik ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] use ZFS for OSDs
I had surgery and have been off for a while. Had to rebuild the test ceph+openstack cluster with whatever spare parts I had. I apologize for the delay to anyone who's been interested. Here are the results: == Hardware/Software 3 node CEPH cluster, 3 OSDs (one OSD per node) -- CPU = 1x E5-2670 v1 RAM = 8GB OS Disk = 500GB SATA OSD = 900GB 10k SAS (sdc - whole device) Journal = Shared Intel SSD DC3500 80GB (sdb1 - 10GB partition) ZFS log = Shared Intel SSD DC3500 80GB (sdb2 - 4GB partition) ZFS L2ARC = Intel SSD 320 40GB (sdd - whole device) - ceph 0.87 ZoL 0.6.3 CentOS 7.0 2 node KVM/OpenStack cluster CPU = 2x Xeon X5650 RAM = 24 GB OS Disk = 500GB SATA - Ubuntu 14.04 OpenStack Juno The rough performance of this oddball-sized test ceph cluster is 1000-1500 IOPS at 8k. == Compression; (cut out unneeded details) Various Debian and CentOS images, with lots of test SVN and GIT data KVM/OpenStack [root@ceph03 ~]# zfs get all SAS1 NAME PROPERTY VALUE SOURCE SAS1 used 586G - SAS1 compressratio 1.50x - SAS1 recordsize 32K local SAS1 checksum on default SAS1 compression lz4 local SAS1 refcompressratio 1.50x - SAS1 written 586G - SAS1 logicalused 877G - == Dedupe; (dedupe is enabled at the dataset level, but dedupe space savings can only be viewed at the pool level - a bit odd, I know) Various Debian and CentOS images, with lots of test SVN and GIT data KVM/OpenStack [root@ceph01 ~]# zpool get all SAS1 NAME PROPERTY VALUE SOURCE SAS1 size 836G - SAS1 capacity 70% - SAS1 dedupratio 1.02x - SAS1 free 250G - SAS1 allocated 586G - == Bitrot/Corruption; Injected random data at random locations (changed seek to a random value) of sdc with: dd if=/dev/urandom of=/dev/sdc seek=54356 bs=4k count=1 Results: 1. ZFS detects the error on disk affecting PG files; as this is a single vdev (no raidz or mirror) it cannot automatically fix it. It blocks all (but delete) access to the affected files (they become inaccessible). *note: I ran this status after already repairing 2 PGs (5.15 and 5.25); ZFS status will no longer list a filename after it has been repaired/deleted/cleared* [root@ceph01 ~]# zpool status -v pool: SAS1 state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://zfsonlinux.org/msg/ZFS-8000-8A scan: scrub in progress since Thu Apr 9 13:04:54 2015 153G scanned out of 586G at 40.3M/s, 3h3m to go 0 repaired, 26.05% done config: NAME STATE READ WRITE CKSUM SAS1 ONLINE 0 0 35 sdc ONLINE 0 0 70 logs sdb2 ONLINE 0 0 0 cache sdd ONLINE 0 0 0 errors: Permanent errors have been detected in the following files: /SAS1/current/5.e_head/DIR_E/DIR_0/DIR_6/rbd\udata.2ba762ae8944a.24cc__head_6153260E__5 2. CEPH-OSD cannot read PG file.
Kicks off scrub/deep-scrub /var/log/ceph/ceph-osd.2.log 2015-04-09 13:10:18.319312 7fcbb163a700 -1 log_channel(default) log [ERR] : 5.18 shard 1: soid cd635018/rbd_data.93d1f74b0dc51.18ee/head//5 candidate had a read error, digest 1835988768 != known digest 473354757 2015-04-09 13:11:38.587014 7fcbb1e3b700 -1 log_channel(default) log [ERR] : 5.18 deep-scrub 0 missing, 1 inconsistent objects 2015-04-09 13:11:38.587020 7fcbb1e3b700 -1 log_channel(default) log [ERR] : 5.18 deep-scrub 1 errors /var/log/ceph/ceph-osd.1.log 2015-04-09 13:11:43.640499 7fe10b3c5700 -1 log_channel(default) log [ERR] : 5.25 shard 1: soid 73eb0125/rbd_data.5315b2ae8944a.5348/head//5 candidate had a read error, digest 1522345897 != known digest 1180025616 2015-04-09 13:12:44.781546 7fe10abc4700 -1 log_channel(default) log [ERR] : 5.25 deep-scrub 0 missing, 1 inconsistent objects 2015-04-09 13:12:44.781553 7fe10abc4700 -1 log_channel(default) log [ERR] : 5.25 deep-scrub 1 errors --- 3. CEPH STATUS reports an error --- [root@client01 ~]# ceph status cluster e93ce4d3-3a46-4082-9ec5-e23c82ca616e health HEALTH_WARN 2
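For anyone wanting to reproduce the test layout above, a minimal sketch from the reported settings (single-vdev pool with an SSD slog and L2ARC; device names are the ones from the test):

zpool create SAS1 sdc log sdb2 cache sdd
zfs set compression=lz4 SAS1
zfs set recordsize=32K SAS1
zfs set dedup=on SAS1   # dedupe was per-dataset in the original test; the pool root is shown here for brevity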
Re: [ceph-users] ceph-osd failure following 0.92 -> 0.94 upgrade
On Thu, Apr 9, 2015 at 2:05 PM, Dirk Grunwald dirk.grunw...@colorado.edu wrote: Ceph cluster, U14.10 base system, OSDs using BTRFS, journal on a partition of the same disk (done using ceph-deploy). I had been running 0.92 without (significant) issue. I upgraded to Hammer (0.94) by modifying /etc/apt/sources.list, apt-get update, apt-get upgrade. Upgraded and restarted ceph-mon and then ceph-osd. Most of the 50 OSDs are in a failure cycle with the error os/Transaction.cc: 504: FAILED assert(ops == data.ops) Right now, the entire cluster is useless because of this. Any suggestions? It looks like maybe it's under the v0.80.x section instead of general upgrading, but the release notes include: * If you are upgrading specifically from v0.92, you must stop all OSD daemons and flush their journals (``ceph-osd -i NNN --flush-journal``) before upgrading. There was a transaction encoding bug in v0.92 that broke compatibility. Upgrading from v0.93, v0.91, or anything earlier is safe. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
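A minimal sketch of the required v0.92 upgrade path quoted above, repeated for every OSD id on the host (upstart-style service commands, matching the rest of this digest):

stop ceph-osd id=0
ceph-osd -i 0 --flush-journal
# upgrade the packages, then:
start ceph-osd id=0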
Re: [ceph-users] Motherboard recommendation?
Hi Markus, The X10DRH-CT can support only 16 drives by default. If you want to connect more drives, there is a special SKU from Supermicro with more drive support, or you need an additional SAS controller. We are using 2x E5-2630 v3 (8-core, 2.4GHz) for 30 drives on an SM X10DRI-T. It is working perfectly on a replication-based cluster. If you are planning to use erasure coding, you have to think about a higher spec. Does anyone know the exact processor requirement of a 30-drive node for erasure coding? I can't find a suitable hardware recommendation for erasure coding. Cheers K.Mohamed Pakkeer On Thu, Apr 9, 2015 at 1:30 PM, Markus Goldberg goldb...@uni-hildesheim.de wrote: Hi, I have a backup storage with ceph 0.93. Like every backup system, it is only ever written to and hopefully never read. The hardware is 3 Supermicro SC847 cases with 30 SATA HDDs each (2- and 4-TB WD disks) = 250TB. I have realized that the motherboards and CPUs are totally undersized, so I want to install new boards. I'm thinking of the following: 3 Supermicro X10DRH-CT or X10DRC-T4+ with 128GB memory each. What do you think about these boards? Will they fit into the SC847? They have SAS and 10G-Base-T onboard, so no extra controller seems to be necessary. What Xeon v3 should I take, how many cores? Does anyone know if M.2 SSDs are supported in their PCIe slots? Thank you very much, Markus -- Markus Goldberg Universität Hildesheim Rechenzentrum Tel +49 5121 88392822 Universitätsplatz 1, D-31141 Hildesheim, Germany Fax +49 5121 88392823 email goldb...@uni-hildesheim.de -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Thanks Regards K.Mohamed Pakkeer Mobile- 0091-8754410114 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy
On 04/08/2015 03:00 PM, Travis Rhoden wrote: Hi Vickey, The easiest way I know of to get around this right now is to add the following line in the section for epel in /etc/yum.repos.d/epel.repo: exclude=python-rados python-rbd So this is what my epel.repo file looks like: http://fpaste.org/208681/ It is those two packages in EPEL that are causing problems. I also tried enabling epel-testing, but that didn't work either. My wild guess is that enabling epel-testing is not enough, because the offending 0.80.7-0.4.el7 build in the stable EPEL repository is still visible to yum. When you set that exclude= parameter in /etc/yum.repos.d/epel.repo, like exclude=python-rados python-rbd python-cephfs, *and* also try --enablerepo=epel-testing, does it work? - Ken ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
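A minimal sketch of the workaround as a one-liner, assuming the stock CentOS 7 repo file path and GNU sed:

# append the exclude line directly under the [epel] section header
sed -i '/^\[epel\]/a exclude=python-rados python-rbd python-cephfs' /etc/yum.repos.d/epel.repo
yum install ceph ceph-common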
Re: [ceph-users] OSDs not coming up on one host
You can turn up debugging (debug osd = 10 and debug filestore = 10 are probably enough, or maybe 20 each) and see what comes out to get more information about why the threads are stuck. But just from the log my answer is the same as before, and now I don't trust that controller (or maybe its disks), regardless of what it's admitting to. ;) -Greg On Thu, Apr 9, 2015 at 1:28 AM, Jacob Reid lists-c...@jacob-reid.co.uk wrote: On Wed, Apr 08, 2015 at 03:42:29PM +0000, Gregory Farnum wrote: I'm on my phone so I can't check exactly what those threads are trying to do, but the osd has several threads which are stuck. The FileStore threads are certainly trying to access the disk/local filesystem. You may not have a hardware fault, but it looks like something in your stack is not behaving when the osd asks the filesystem to do something. Check dmesg, etc. -Greg Noticed a bit in dmesg that seems to be controller-related (HP Smart Array P420i) where I/O was hanging in some cases[1]; fixed by updating from 5.42 to 6.00 [1] http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c03555882 In dmesg: [11775.779477] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 Tag:0x:0010 Command:0x2a SN:0x49fb REQUEST SUCCEEDED. [11812.170350] hpsa :08:00.0: Abort request on C1:B0:T0:L0 [11817.386773] hpsa :08:00.0: cp 880522bff000 is reported invalid (probably means target device no longer present) [11817.386784] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 Tag:0x:0010 Command:0x2a SN:0x4a13 REQUEST SUCCEEDED. The problem still appears to be persisting in the cluster: although I am no longer seeing the disk-related errors in dmesg, I am still getting errors in the osd logs: 2015-04-08 17:24:15.024820 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4 2015-04-08 17:24:15.025043 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4 2015-04-08 17:48:33.146399 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4 2015-04-08 17:48:33.146439 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4 2015-04-08 18:55:31.107727 7f0f16740700 1 heartbeat_map reset_timeout 'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4 2015-04-08 18:55:31.107774 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4 2015-04-08 18:55:31.107789 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4 2015-04-08 18:55:31.108225 7f0f29eaf700 1 heartbeat_map is_healthy 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4 2015-04-08 18:55:31.108268 7f0f15f3f700 1 heartbeat_map reset_timeout 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4 2015-04-08 18:55:31.108272 7f0f29eaf700 1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4 2015-04-08 18:55:31.108281 7f0f29eaf700 1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4 2015-04-08 18:55:31.108285 7f0f1573e700 1 heartbeat_map reset_timeout 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4 2015-04-08 18:55:31.108345 7f0f16f41700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4 2015-04-08 18:55:31.108378 7f0f17742700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4 2015-04-08 19:01:20.694897 7f0f15f3f700 1 heartbeat_map
reset_timeout 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4 2015-04-08 19:01:20.694928 7f0f17742700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4 2015-04-08 19:01:20.694970 7f0f16f41700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4 2015-04-08 19:01:20.695544 7f0f1573e700 1 heartbeat_map reset_timeout 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4 2015-04-08 19:01:20.695665 7f0f16740700 1 heartbeat_map reset_timeout 'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4 2015-04-08 19:01:34.979288 7f0f1573e700 1 heartbeat_map reset_timeout 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4 2015-04-08 19:01:34.979498 7f0f21e9f700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4 2015-04-08 19:01:34.979513 7f0f16f41700 1 heartbeat_map reset_timeout 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4 2015-04-08 19:01:34.979535 7f0f2169e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4 2015-04-08 19:01:34.980021 7f0f15f3f700 1 heartbeat_map reset_timeout 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4 2015-04-08
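A hedged sketch of turning the suggested debugging on without restarting the daemon (the config keys are the ones Greg suggests earlier in the thread):

ceph tell osd.15 injectargs '--debug-osd 20 --debug-filestore 20'
# or persistently, in ceph.conf under [osd]:
#   debug osd = 20
#   debug filestore = 20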
Re: [ceph-users] protocol feature mismatch after upgrading to Hammer
This particular problem I just figured out myself ('ceph -w' was still running from before the upgrade, and ctrl-c and restarting solved that issue), but I'm still having a similar problem on the ceph client: libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca server's 102b84a042aca, missing 1 It appears that even the latest kernel doesn't have support for CEPH_FEATURE_CRUSH_V4. How do I make my ceph cluster backward-compatible with the old cephfs client? On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson kylehut...@ksu.edu wrote: I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly repeating this message: 2015-04-09 08:50:26.318042 7f95dbf86700 0 -- 10.5.38.1:0/2037478 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1 c=0x7f95e0023670).connect protocol feature mismatch, my 3fff peer 13fff missing 1 It isn't always the same IP for the destination - here's another: 2015-04-09 08:50:20.322059 7f95dc087700 0 -- 10.5.38.1:0/2037478 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1 c=0x7f95e002b480).connect protocol feature mismatch, my 3fff peer 13fff missing 1 Some details about our install: We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts are mds. We are running cephfs with a client trying to write data over cephfs when we're seeing these messages. Any ideas? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] cache-tier do not evict
What ceph version do you use? Regards, On 9 Apr 2015 18:58, Patrik Plank pat...@plank.me wrote: Hi, I have built a cache-tier pool (replica 2) with 3 x 512gb ssd for my kvm pool. These are my settings: ceph osd tier add kvm cache-pool ceph osd tier cache-mode cache-pool writeback ceph osd tier set-overlay kvm cache-pool ceph osd pool set cache-pool hit_set_type bloom ceph osd pool set cache-pool hit_set_count 1 ceph osd pool set cache-pool hit_set_period 3600 ceph osd pool set cache-pool target_max_bytes 751619276800 ceph osd pool set cache-pool target_max_objects 100 ceph osd pool set cache-pool cache_min_flush_age 1800 ceph osd pool set cache-pool cache_min_evict_age 600 ceph osd pool set cache-pool cache_target_dirty_ratio .4 ceph osd pool set cache-pool cache_target_full_ratio .8 So the problem is, the cache-tier does not evict automatically. If I copy some kvm images to the ceph cluster, the cache osds always run full. Is that normal? Is there a misconfiguration? thanks best regards Patrik ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] low power single disk nodes
Hello ceph users, Is anyone running any low-powered single disk nodes with Ceph now? Calxeda seems to be no more according to Wikipedia. I do not think HP Moonshot is what I am looking for - I want stand-alone nodes, not server cartridges integrated into server chassis. And I do not want to be locked to a single vendor. I was playing with Raspberry Pi 2 for signage when I thought of my old experiments with Ceph. I am thinking of for example Odroid-C1 or Odroid-XU3 Lite, or maybe something with a low-power Intel x64/x86 processor. Together with one SSD or one low power HDD the node could get all power via PoE (via a splitter, or integrated into the board if such boards exist). PoE provides remote power-on/power-off even for consumer grade nodes. The cost for a single low power node should be able to compete with a traditional PC-server's price per disk. Ceph takes care of redundancy. I think simple custom casing should be good enough - maybe just strap or velcro everything on trays in the rack, at least for the nodes with SSD. Kind regards, -- Jerker Nyberg, Uppsala, Sweden. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Ceph Hammer : Ceph-deploy 1.5.23-0 : RGW civetweb :: Not getting installed
Hello Cephers, I am trying to set up RGW using ceph-deploy, which is described here: http://docs.ceph.com/docs/master/start/quick-ceph-deploy/#add-an-rgw-instance But unfortunately it doesn't seem to be working. Is there something I am missing, or do you know of a fix for this? [root@ceph-node1 yum.repos.d]# ceph -v *ceph version 0.94* (e61c4f093f88e44961d157f65091733580cea79a) [root@ceph-node1 yum.repos.d]# # yum update ceph-deploy SKIPPED Verifying : ceph-deploy-1.5.22-0.noarch 2/2 Updated: * ceph-deploy.noarch 0:1.5.23-0* Complete! [root@ceph-node1 ceph]# [root@ceph-node1 ceph]# ceph-deploy rgw create rgw-node1 [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy rgw create rgw-node1 [ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts rgw-node1:rgw.rgw-node1 *[ceph_deploy][ERROR ] RuntimeError: bootstrap-rgw keyring not found; run 'gatherkeys'* [root@ceph-node1 ceph]# ceph-deploy --overwrite-conf mon create-initial [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy --overwrite-conf mon create-initial SKIPPED [ceph_deploy.mon][INFO ] mon.ceph-node1 monitor has reached quorum! [ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum [ceph_deploy.mon][INFO ] Running gatherkeys... [ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring [ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for /var/lib/ceph/bootstrap-rgw/ceph.keyring [ceph-node1][DEBUG ] connected to host: ceph-node1 [ceph-node1][DEBUG ] detect platform information from remote host [ceph-node1][DEBUG ] detect machine type [ceph-node1][DEBUG ] fetch remote file *[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1* *[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be able to deploy RGW daemons* [root@ceph-node1 ceph]# [root@ceph-node1 ceph]# ceph-deploy gatherkeys ceph-node1 [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy gatherkeys ceph-node1 [ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring [ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for /var/lib/ceph/bootstrap-rgw/ceph.keyring [ceph-node1][DEBUG ] connected to host: ceph-node1 [ceph-node1][DEBUG ] detect platform information from remote host [ceph-node1][DEBUG ] detect machine type [ceph-node1][DEBUG ] fetch remote file *[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1* *[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be able to deploy RGW daemons* [root@ceph-node1 ceph]# Regards VS ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
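No reply appears in this digest, but a hedged sketch of the usual workaround for this symptom: clusters whose monitors were bootstrapped before the RGW bootstrap key existed lack it, and it can be created by hand. The capability string is an assumption based on what newer deployments generate:

mkdir -p /var/lib/ceph/bootstrap-rgw
ceph auth get-or-create client.bootstrap-rgw mon 'allow profile bootstrap-rgw' \
  -o /var/lib/ceph/bootstrap-rgw/ceph.keyring
# then re-run:
#   ceph-deploy gatherkeys ceph-node1
#   ceph-deploy rgw create rgw-node1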
Re: [ceph-users] low power single disk nodes
These are really interesting to me, but how can you buy them? What's the performance like in ceph? Are they using the keyvaluestore backend, or something specific to these drives? Also what kind of chassis do they go into (some kind of ethernet JBOD)? Bryan On 4/9/15, 9:43 AM, Mark Nelson mnel...@redhat.com wrote: How about drives that run Linux with an ARM processor, RAM, and an ethernet port right on the drive? Notice the Ceph logo. :) https://www.hgst.com/science-of-storage/emerging-technologies/open-ethernet-drive-architecture Mark On 04/09/2015 10:37 AM, Scott Laird wrote: Minnowboard Max? 2 atom cores, 1 SATA port, and a real (non-USB) Ethernet port. On Thu, Apr 9, 2015, 8:03 AM p...@philw.com wrote: Rather expensive option: Applied Micro X-Gene, overkill for a single disk, and only really available in a development kit format right now. https://www.apm.com/products/data-center/x-gene-family/x-c1-development-kits/ Better Option: Ambedded CY7 - 7 nodes in 1U half depth, 6 positions for SATA disks, and one node with mSATA SSD http://www.ambedded.com.tw/pt_list.php?CM_ID=20140214001 --phil On 09 April 2015 at 15:57 Quentin Hartman qhart...@direwolfdigital.com wrote: I'm skeptical about how well this would work, but a Banana Pi might be a place to start. Like a Raspberry Pi, but it has a SATA connector: http://www.bananapi.org/ On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg jer...@update.uu.se wrote: Hello ceph users, Is anyone running any low-powered single disk nodes with Ceph now? Calxeda seems to be no more according to Wikipedia. I do not think HP Moonshot is what I am looking for - I want stand-alone nodes, not server cartridges integrated into server chassis. And I do not want to be locked to a single vendor. I was playing with Raspberry Pi 2 for signage when I thought of my old experiments with Ceph. I am thinking of for example Odroid-C1 or Odroid-XU3 Lite, or maybe something with a low-power Intel x64/x86 processor. Together with one SSD or one low power HDD the node could get all power via PoE (via a splitter, or integrated into the board if such boards exist). PoE provides remote power-on/power-off even for consumer grade nodes. The cost for a single low power node should be able to compete with a traditional PC-server's price per disk. Ceph takes care of redundancy. I think simple custom casing should be good enough - maybe just strap or velcro everything on trays in the rack, at least for the nodes with SSD. Kind regards, -- Jerker Nyberg, Uppsala, Sweden.
Re: [ceph-users] RBD hard crash on kernel 3.10
Thanks for the pointer to the patched kernel. I'll give that a shot. On Thu, Apr 9, 2015, 5:56 AM Ilya Dryomov idryo...@gmail.com wrote: On Wed, Apr 8, 2015 at 5:25 PM, Shawn Edwards lesser.e...@gmail.com wrote: We've been working on a storage repository for xenserver 6.5, which uses the 3.10 kernel (ug). I got the xenserver guys to include the rbd and libceph kernel modules into the 6.5 release, so that's at least available. Where things go bad is when we have many (10 or so) VMs on one host, all using RBD clones for the storage mapped using the rbd kernel module. The Xenserver crashes so badly that it doesn't even get a chance to kernel panic. The whole box just hangs. I'm not very familiar with Xen and ways to debug it but if the problem lies in libceph or rbd kernel modules we'd like to fix it. Perhaps try grabbing a vmcore? If it just hangs and doesn't panic you can normally induce a crash with a sysrq. Has anyone else seen this sort of behavior? We have a lot of ways to try to work around this, but none of them are very pretty: * move the code to user space, ditch the kernel driver: The build tools for Xenserver are all CentOS5 based, and it is painful to get all of the deps built to get the ceph user space libs built. * backport the ceph and rbd kernel modules to 3.10. Has proven painful, as the block device code changed somewhere in the 3.14-3.16 timeframe. https://github.com/ceph/ceph-client/commits/rhel7-3.10.0-123.9.3 branch would be a good start - it has libceph.ko and rbd.ko as of 3.18-rc5 backported to rhel7 (which is based on 3.10) and may be updated in the future as well, although no promises on that. Thanks, Ilya ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RBD hard crash on kernel 3.10
On Wed, Apr 8, 2015 at 5:25 PM, Shawn Edwards lesser.e...@gmail.com wrote: We've been working on a storage repository for xenserver 6.5, which uses the 3.10 kernel (ug). I got the xenserver guys to include the rbd and libceph kernel modules into the 6.5 release, so that's at least available. Where things go bad is when we have many (10 or so) VMs on one host, all using RBD clones for the storage mapped using the rbd kernel module. The Xenserver crashes so badly that it doesn't even get a chance to kernel panic. The whole box just hangs. I'm not very familiar with Xen and ways to debug it but if the problem lies in libceph or rbd kernel modules we'd like to fix it. Perhaps try grabbing a vmcore? If it just hangs and doesn't panic you can normally induce a crash with a sysrq. Has anyone else seen this sort of behavior? We have a lot of ways to try to work around this, but none of them are very pretty: * move the code to user space, ditch the kernel driver: The build tools for Xenserver are all CentOS5 based, and it is painful to get all of the deps built to get the ceph user space libs built. * backport the ceph and rbd kernel modules to 3.10. Has proven painful, as the block device code changed somewhere in the 3.14-3.16 timeframe. https://github.com/ceph/ceph-client/commits/rhel7-3.10.0-123.9.3 branch would be a good start - it has libceph.ko and rbd.ko as of 3.18-rc5 backported to rhel7 (which is based on 3.10) and may be updated in the future as well, although no promises on that. Thanks, Ilya ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
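A minimal sketch of the sysrq route Ilya mentions, assuming kdump is configured on the host so the forced crash actually produces a vmcore to inspect:

echo 1 > /proc/sys/kernel/sysrq     # enable all magic sysrq functions
echo c > /proc/sysrq-trigger        # force an immediate crash (and a dump, via kdump)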
Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy
Thanks for the help, guys; here is my feedback from the tests. @Michael Kidd: yum install ceph ceph-common --disablerepo=base --disablerepo=epel Did not work; here are the logs: http://fpaste.org/208828/56448714/ @Travis Rhoden: Yep, *exclude=python-rados python-rbd* under epel.repo did the trick and I can install Firefly / Giant without errors. Thanks. Any idea when this will be fixed once and for all (so I no longer need to patch epel.repo to exclude python-r*)? - VS - On Thu, Apr 9, 2015 at 4:26 AM, Michael Kidd linuxk...@redhat.com wrote: I don't think this came through the first time.. resending.. If it's a dupe, my apologies.. For Firefly / Giant installs, I've had success with the following: yum install ceph ceph-common --disablerepo=base --disablerepo=epel Let us know if this works for you as well. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services - by Red Hat On Wed, Apr 8, 2015 at 9:07 PM, Michael Kidd linuxk...@redhat.com wrote: For Firefly / Giant installs, I've had success with the following: yum install ceph ceph-common --disablerepo=base --disablerepo=epel Let us know if this works for you as well. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services - by Red Hat On Wed, Apr 8, 2015 at 8:55 PM, Travis Rhoden trho...@gmail.com wrote: I did also confirm that, as Ken mentioned, this is not a problem on Hammer since Hammer includes the package split (python-ceph became python-rados and python-rbd). - Travis On Wed, Apr 8, 2015 at 5:00 PM, Travis Rhoden trho...@gmail.com wrote: Hi Vickey, The easiest way I know of to get around this right now is to add the following line in the section for epel in /etc/yum.repos.d/epel.repo: exclude=python-rados python-rbd So this is what my epel.repo file looks like: http://fpaste.org/208681/ It is those two packages in EPEL that are causing problems. I also tried enabling epel-testing, but that didn't work either. Unfortunately you would need to add this line on each node where Ceph Giant is being installed. - Travis On Wed, Apr 8, 2015 at 4:11 PM, Vickey Singh vickey.singh22...@gmail.com wrote: Community, need help. -VS- On Wed, Apr 8, 2015 at 4:36 PM, Vickey Singh vickey.singh22...@gmail.com wrote: Any suggestions, geeks? VS On Wed, Apr 8, 2015 at 2:15 PM, Vickey Singh vickey.singh22...@gmail.com wrote: Hi, The below suggestion also didn't work. Full logs here: http://paste.ubuntu.com/10771939/ [root@rgw-node1 yum.repos.d]# yum --showduplicates list ceph Loaded plugins: fastestmirror, priorities Loading mirror speeds from cached hostfile * base: mirror.zetup.net * epel: ftp.fi.muni.cz * extras: mirror.zetup.net * updates: mirror.zetup.net 25 packages excluded due to repository priority protections Available Packages ceph.x86_64 0.80.6-0.el7.centos Ceph ceph.x86_64 0.80.7-0.el7.centos Ceph ceph.x86_64 0.80.8-0.el7.centos Ceph ceph.x86_64 0.80.9-0.el7.centos Ceph [root@rgw-node1 yum.repos.d]# It's not able to install the latest available package; yum is getting confused with the other DOT releases. Any other suggestion to fix this ???
--> Processing Dependency: libboost_system-mt.so.1.53.0()(64bit) for package: librbd1-0.80.9-0.el7.centos.x86_64
--> Processing Dependency: libboost_thread-mt.so.1.53.0()(64bit) for package: librbd1-0.80.9-0.el7.centos.x86_64
--> Finished Dependency Resolution
Error: Package: librbd1-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: libboost_system-mt.so.1.53.0()(64bit)
Error: Package: ceph-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: libboost_system-mt.so.1.53.0()(64bit)
Error: Package: ceph-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: libaio.so.1(LIBAIO_0.4)(64bit)
Error: Package: ceph-common-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: libboost_thread-mt.so.1.53.0()(64bit)
Error: Package: ceph-common-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: librados2 = 0.80.7-0.el7.centos
       Available: librados2-0.80.6-0.el7.centos.x86_64 (Ceph)
           librados2 = 0.80.6-0.el7.centos
       Available: librados2-0.80.7-0.el7.centos.x86_64 (Ceph)
           librados2 = 0.80.7-0.el7.centos
       Available: librados2-0.80.8-0.el7.centos.x86_64 (Ceph)
           librados2 = 0.80.8-0.el7.centos
       Installing: librados2-0.80.9-0.el7.centos.x86_64 (Ceph)
           librados2 = 0.80.9-0.el7.centos
Error: Package: libcephfs1-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: libboost_thread-mt.so.1.53.0()(64bit)
Error: Package: ceph-common-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: python-requests
Error: Package: ceph-common-0.80.7-0.el7.centos.x86_64 (Ceph)
       Requires: librbd1 = 0.80.7-0.el7.centos
       Available: librbd1-0.80.6-0.el7.centos.x86_64 (Ceph)
           librbd1 =
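The mixed 0.80.7/0.80.9 resolution in the errors above suggests yum is selecting different dot releases for different packages. Pinning every package to one explicit version is one way to test that theory - a sketch only, untested against this exact repo layout:

    yum install ceph-0.80.9-0.el7.centos ceph-common-0.80.9-0.el7.centos \
        librados2-0.80.9-0.el7.centos librbd1-0.80.9-0.el7.centos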
Re: [ceph-users] Cascading Failure of OSDs
I use the following:

    cat /sys/class/net/em1/statistics/rx_bytes

for the em1 interface; all the other stats are available there as well.

Paul Hewlett
Senior Systems Engineer
Velocix, Cambridge
Alcatel-Lucent
t: +44 1223 435893  m: +44 7985327353

From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Carl-Johan Schenström [carl-johan.schenst...@gu.se]
Sent: 09 April 2015 07:34
To: Francois Lafont; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Cascading Failure of OSDs

Francois Lafont wrote: Just in case it could be useful, I have noticed the -s option (on my Ubuntu) that offers output probably easier to parse:

    # column -t is just to make it nice for human eyes
    ifconfig -s | column -t

Since ifconfig is deprecated, one should use iproute2 instead:

    ip -s link show p2p1 | awk '/(RX|TX):/{getline; print $3;}'

However, the sysfs interface is probably a better alternative. See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-class-net-statistics and https://www.kernel.org/doc/Documentation/ABI/README.

--
Carl-Johan Schenström
Driftansvarig / System Administrator
Språkbanken / The Swedish Language Bank
Svensk nationell datatjänst / Swedish National Data Service
Göteborgs universitet / University of Gothenburg
carl-johan.schenst...@gu.se / +46 709 116769
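Building on the sysfs suggestion, a small sketch that dumps every counter the kernel exposes for an interface (em1 is just an example name; substitute your own):

    for f in /sys/class/net/em1/statistics/*; do
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    done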
Re: [ceph-users] MDS unmatched rstat after upgrade hammer
On 09/04/2015 17:09, Scottix wrote: Alright, sounds good. Only one comment then: from an IT/ops perspective all I see is ERR, and that raises red flags, so the exposure of the message might need some tweaking. In production I like to be notified of an issue but have reassurance that it was fixed within the system.

Fair point. Unfortunately, in general we can't distinguish between inconsistencies we're fixing up due to a known software bug and inconsistencies that we're encountering for unknown reasons. The reason this is an error rather than a warning is that we handle this case by arbitrarily trusting one statistic when it disagrees with another, so we don't *know* that we've correctly repaired it - we just hope.

Anyway, the solution is the forthcoming scrub functionality, which will be able to unambiguously repair things like this and give you a clearer statement about what happened.

Cheers,
John