Re: [gentoo-user] Re: eix-remote update
On 01/19/2015 05:07 PM, Alec Ten Harmsel wrote:

Just ran it; it downloads http://gpo.zugaina.org/eix_cache/eix-cache.tbz2. It terminated with an error while writing the database file, something related to overlays not existing or something.

The help page (run eix-remote with no command) tells me that `eix-remote update` downloads an eix cache of a bunch of gentoo overlays. Unless you're interested in searching for software in the overlays without having all the overlay portage trees on your machine,

Yes, that's the point.

I would stick to running a plain `eix-sync`. You could always try removing /var/cache/eix/remote.eix and then run `eix-remote update2` - that one succeeds on my machine.

eix-remote update2 succeeds on my system too.

FYI, `eix-sync` never calls any `eix-remote` commands; `eix-sync` runs these three commands in this order: `emerge --sync`, `eix-update`, `eix-diff`. If this command is failing, I would recommend deleting all the files in /var/cache/eix.

If I delete the files in /var/cache/eix and run eix-sync (without having run eix-remote update) it terminates successfully.
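For anyone hitting the same failure, the recovery steps discussed above boil down to roughly this (the cache path is the default one eix uses; adjust if yours differs):

  rm /var/cache/eix/remote.eix     # or clear out /var/cache/eix entirely
  eix-remote update2               # the alternative remote-cache fetch that works here
  eix-sync                         # runs emerge --sync, then eix-update, then eix-diff

eix-sync on its own never touches the remote cache, so it is the safe fallback if you don't need overlay searching.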
[gentoo-user] Re: btrfs fails to balance
Bill Kenworthy billk at iinet.net.au writes:

The main thing keeping me away from CephFS is that it has no mechanism for resolving silent corruption. Btrfs underneath it would obviously help, though not for failure modes that involve CephFS itself. I'd feel a lot better if CephFS had some way of determining which copy was the right one other than "the master server always wins".

The Giant version 0.87 is a major release with many new fixes; it may have the features you need. The ongoing releases are currently up to v0.91. The readings look promising, but I'll agree it needs to be tested with non-critical data.
http://ceph.com/docs/master/release-notes/#v0-87-giant
http://ceph.com/docs/master/release-notes/#notable-changes

Forget ceph on btrfs for the moment - the COW kills it stone dead after real use. When running a small handful of VMs on a raid1 with ceph - slow :)

I'm staying away from VMs. It's Spark on top of Mesos I'm after. Maybe docker or another container solution, down the road. I read where some are using an SSD with raid 1 and bcache to speed up performance and stability a bit. I do not want to add SSD to the mix right now, as the (3) node development systems all have 32 GB of RAM.

You can turn off COW and go single on btrfs to speed it up, but bugs in ceph and btrfs lose data real fast!

Interesting idea, since I'll have raid1 underneath each node. I'll need to dig into this idea a bit more.

ceph itself (my last setup trashed itself 6 months ago and I've given up!) will only work under real use/heavy loads with lots of discrete systems, ideally a 10G network, and small disks to spread the failure domain. Using 3 hosts and 2x2g disks per host wasn't near big enough :( Its design means that small-scale trials just won't work.

Huh. My systems are FX8350 (8-core) processors running at 4GHz with 32 GB of RAM. Water coolers will allow me to crank up the speed (when/if needed) to 5 or 6 GHz. Not Intel, but not low end either.

It's not designed for small-scale/low-end hardware, no matter how attractive the idea is :(

Supposedly there are tools to measure/monitor ceph better now. That is one of the things I need to research: how to manage the small cluster better and back off the throughput/load while monitoring performance on a variety of different tasks. Definitely not production usage. I certainly appreciate your ceph experiences. I filed a bug requesting a version bump to Giant v0.87. Did you run that version? What versions did you experiment with? I hope to set up Ansible to facilitate rapid installations of a variety of gentoo systems used for cluster or ceph testing; that way systems should be quick to rebuild after bad failures. Did the failures you experienced with Ceph require the gentoo-btrfs based systems to be completely reinstalled from scratch, or just purging the disks of Ceph and reconfiguring Ceph? I'm hoping to configure ceph in such a way that failures do not corrupt the gentoo-btrfs installation and only require repairs to ceph; so your comments on that strategy are most welcome.

BillK
James
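For the record, the no-COW tweak Bill mentions is done on btrfs with either the nodatacow mount option or chattr +C; a rough sketch, untested here, with the ceph OSD path and device names purely as examples (note that chattr +C only affects files created after it is set, and that disabling COW also disables btrfs checksumming):

  chattr +C /var/lib/ceph/osd                 # new files under the OSD dir are created without COW
  mount -o nodatacow /dev/sdX /mnt/ceph-osd   # or mount the whole filesystem without COW
  btrfs balance start -dconvert=single -mconvert=single /mnt/ceph-osd   # "go single" for data and metadata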
Re: [gentoo-user] Re: btrfs fails to balance
On Tue, Jan 20, 2015 at 10:07 AM, James wirel...@tampabay.rr.com wrote: Bill Kenworthy billk at iinet.net.au writes: You can turn off COW and go single on btrfs to speed it up but bugs in ceph and btrfs lose data real fast! Interesting idea, since I'll have raid1 underneath each node. I'll need to dig into this idea a bit more. So, btrfs and ceph solve an overlapping set of problems in an overlapping set of ways. In general adding data security often comes at the cost of performance, and obviously adding it at multiple layers can come at the cost of additional performance. I think the right solution is going to depend on the circumstances. if ceph provided that protection against bitrot I'd probably avoid a COW filesystem entirely. It isn't going to add any additional value, and they do have a performance cost. If I had mirroring at the ceph level I'd probably just run them on ext4 on lvm with no mdadm/btrfs/whatever below that. Availability is already ensured by ceph - if you lose a drive then other nodes will pick up the load. If I didn't have robust mirroring at the ceph level then having mirroring of some kind at the individual node level would improve availability. On the other hand, ceph currently has some gaps, so having it on top of zfs/btrfs could provide protection against bitrot. However, right now there is no way to turn off COW while leaving checksumming enabled. It would be nice if you could leave the checksumming on. Then if there was bitrot btrfs would just return an error when you tried to read the file, and then ceph would handle it like any other disk error and use a mirrored copy on another node. The problem with ceph+ext4 is that if there is bitrot neither layer will detect it. Does btrfs+ceph really have a performance hit that is larger than btrfs without ceph? I fully expect it to be slower than ext4+ceph. Btrfs in general performs fairly poorly right now - that is expected to improve in the future, but I doubt that it will ever outperform ext4 other than for specific operations that benefit from it (like reflink copies). It will always be faster to just overwrite one block in the middle of a file than to write the block out to unallocated space and update all the metadata. -- Rich
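As an aside, the bitrot detection Rich describes is what a periodic scrub surfaces on btrfs; a minimal sketch, with the mount point only as an example:

  btrfs scrub start /mnt/osd     # read everything back and verify checksums
  btrfs scrub status /mnt/osd    # summary of corrected and uncorrectable errors
  btrfs device stats /mnt/osd    # per-device error counters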
Re: [gentoo-user] Re: Get off my lawn?
On Mon, Jan 19, 2015 at 1:03 PM, James wirel...@tampabay.rr.com wrote:

I think the fundamental flaw with systemd is the fact that the duality of support for systemd and other init solutions was so quickly abandoned. If they were allowed (encouraged) to run side by side for a few years, folks could decide then; as it is, it's a major abandonment of principle, imho.

The problem is logind. Various apps defaulted to depending on it rather than on consolekit, and logind isn't a standalone systemd executable. The Linux world could've been saved a lot of aggravation if a different choice had been made...

Lots of folks in the embedded Linux world are scratching their heads at systemd; the conclusion from most of what I read is "no thanks" anyway.

Lennart claims that the embedded world loves systemd. I suspect that, as in other corners of the Linux world, there are lovers and haters of systemd.
[gentoo-user] Re: btrfs fails to balance
Rich Freeman rich0 at gentoo.org writes:

You can turn off COW and go single on btrfs to speed it up but bugs in ceph and btrfs lose data real fast!

So, btrfs and ceph solve an overlapping set of problems in an overlapping set of ways. In general adding data security often comes at the cost of performance, and obviously adding it at multiple layers can come at the cost of additional performance. I think the right solution is going to depend on the circumstances.

Raid 1 with btrfs can not only protect the ceph fs files but the gentoo node installation itself. I'm not so worried about performance, because my main (end result) goal is to throttle codes so they run almost exclusively in RAM (in memory), as designed by amplabs. Spark plus Tachyon is a work in progress, for sure. The DFS will be used in lieu of HDFS for distributed/cluster types of apps, hence ceph. Btrfs + raid 1 is a failsafe for the node installations, but also for all data. I only intend to write out data once a job/run is finished; but granted, that is very experimental right now and will evolve over time.

if ceph provided that protection against bitrot I'd probably avoid a COW filesystem entirely. It isn't going to add any additional value, and they do have a performance cost. If I had mirroring at the ceph level I'd probably just run them on ext4 on lvm with no mdadm/btrfs/whatever below that. Availability is already ensured by ceph - if you lose a drive then other nodes will pick up the load. If I didn't have robust mirroring at the ceph level then having mirroring of some kind at the individual node level would improve availability.

I've read that btrfs and ceph are a very suitable, yet very immature, match for local-distributed file system needs.

On the other hand, ceph currently has some gaps, so having it on top of zfs/btrfs could provide protection against bitrot. However, right now there is no way to turn off COW while leaving checksumming enabled. It would be nice if you could leave the checksumming on. Then if there was bitrot btrfs would just return an error when you tried to read the file, and then ceph would handle it like any other disk error and use a mirrored copy on another node. The problem with ceph+ext4 is that if there is bitrot neither layer will detect it.

Good points, hence a flexible configuration where ceph can be reconfigured and recovered as warranted, for this long term set of experiments. Does btrfs+ceph really have a performance hit that is larger than btrfs without ceph?

I fully expect it to be slower than ext4+ceph. Btrfs in general performs fairly poorly right now - that is expected to improve in the future, but I doubt that it will ever outperform ext4 other than for specific operations that benefit from it (like reflink copies). It will always be faster to just overwrite one block in the middle of a file than to write the block out to unallocated space and update all the metadata.

I fully expect the combination of btrfs+ceph to mature and become competitive. It's not critical data, but a long term experiment. Surely critical data will be backed up off the 3-node cluster. I hope to use Ansible to enable recovery, configuration changes, and bringing on and managing additional nodes; this is a concept at the moment, but googling around, it does seem to be a popular idea. As always, your insight and advice are warmly received.

James
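A minimal sketch of the btrfs raid1 layout James describes, with device names and label as placeholders:

  mkfs.btrfs -L node00 -d raid1 -m raid1 /dev/sda3 /dev/sdb3
  mount LABEL=node00 /mnt/gentoo
  btrfs filesystem df /mnt/gentoo    # should report "Data, RAID1" and "Metadata, RAID1"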
Re: [gentoo-user] SOLVED: Slow startx - Why does hostname -f hang?
On Sun, Jan 4, 2015 at 1:55 PM, Urs Schütz u.sch...@bluewin.ch wrote:

I changed /etc/hosts as Mick/Michael pointed out in another reply, and this solved the slow response. Here is the relevant part of the corrected, working /etc/hosts:

# IPv4 and IPv6 localhost aliases
127.0.0.1   cadd.homeLAN localhost
::1         cadd.homeLAN localhost cadd

Thanks for your time looking into it and the hints.

I've taken to using Debian's setup for dhcp systems

127.0.0.1   localhost
127.0.1.1   cadd.homeLAN cadd

across all distros in order to have a clean "127.0.0.1 localhost" line.
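A quick way to confirm the fix, assuming the host entries above:

  hostname -f         # should return cadd.homeLAN immediately instead of hanging
  getent hosts cadd   # shows which /etc/hosts line the resolver actually uses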
Re: [gentoo-user] Re: btrfs fails to balance
On Tue, Jan 20, 2015 at 12:27 PM, James wirel...@tampabay.rr.com wrote: Raid 1 with btrfs can not only protect the ceph fs files but the gentoo node installation itself. Agree 100%. Like I said, the right solution depends on your situation. If you're using the server doing ceph storage only for file serving, then protecting the OS installation isn't very important. Heck, you could just run the OS off of a USB stick. If you're running nodes that do a combination of application and storage, then obviously you need to worry about both, which probably means not relying on ceph as your sole source of protection. That applies to a lot of kitchen sink setups where hosts don't have a single role. -- Rich
[gentoo-user] SMART drive test results, 2.0 for same drive as before.
Howdy,

This is concerning a hard drive I had issues with a while back. I been using it to do backups with as a test if nothing else. Anyway, it seems to have issues once again.

root@fireball / # smartctl -l selftest /dev/sdd
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.16.3-gentoo] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       40%      21406        4032272464
# 2  Short offline       Completed without error       00%      21387        -
# 3  Short offline       Completed without error       00%      21363        -
# 4  Extended offline    Completed: read failure       40%      21343        4032272464
# 5  Short offline       Completed without error       00%      21315        -
# 6  Short offline       Completed without error       00%      21291        -
# 7  Short offline       Completed without error       00%      21267        -
# 8  Short offline       Completed without error       00%      21243        -
# 9  Short offline       Completed without error       00%      21219        -
#10  Short offline       Completed without error       00%      21195        -
#11  Extended offline    Completed: read failure       40%      21174        4032272464
#12  Short offline       Completed without error       00%      21147        -
#13  Short offline       Completed without error       00%      21123        -
#14  Short offline       Completed without error       00%      21099        -
#15  Short offline       Completed without error       00%      21075        -
#16  Short offline       Completed without error       00%      21051        -
#17  Short offline       Completed without error       00%      21026        -
#18  Extended offline    Completed: read failure       40%      21005        4032267424
#19  Short offline       Completed without error       00%      20978        -
#20  Short offline       Completed without error       00%      20954        -
#21  Short offline       Completed without error       00%      20930        -
root@fireball / #

More info:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   116   099   006    Pre-fail  Always       -       114620384
  3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       39
  5 Reallocated_Sector_Ct   0x0033   053   051   036    Pre-fail  Always       -       62752
  7 Seek_Error_Rate         0x000f   080   060   030    Pre-fail  Always       -       102219639
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       21403
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       40
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   063   045    Old_age   Always       -       32 (Min/Max 23/36)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       11
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       276725
194 Temperature_Celsius     0x0022   032   040   000    Old_age   Always       -       32 (0 17 0 0 0)
197 Current_Pending_Sector  0x0012   088   088   000    Old_age   Always       -       1984
198 Offline_Uncorrectable   0x0010   088   088   000    Old_age   Offline      -       1984
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x       100   253   000    Old_age   Offline      -       18810h+14m+31.520s
241 Total_LBAs_Written      0x       100   253   000    Old_age   Offline      -       110684232213092
242 Total_LBAs_Read         0x       100   253   000    Old_age   Offline      -       92603114597547

I thought I would check this thing manually just to be nosy. When I saw the errors, I then got curious as to why I hadn't received a email on this issue.
Well, long story short, I changed the password for google mail and forgot to change it for that SMTP thing or whatever. Fixed that now. Anyway, no clue how long this issue has been going on but there it is again.
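For anyone following along, re-running the checks above looks roughly like this (same device name as in the post; the smartd.conf line is only an illustration of how to get the failure mails again):

  smartctl -t long /dev/sdd           # queue another extended self-test
  smartctl -l selftest -A /dev/sdd    # self-test log plus the attribute table once it finishes
  # example /etc/smartd.conf entry: run tests on a schedule and mail failures out
  /dev/sdd -a -o on -S on -s (S/../.././02|L/../../6/03) -m you@example.com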
Re: [gentoo-user] Re: Get off my lawn?
Quoting Tom H tomh0...@gmail.com:

Lennart claims that the embedded world loves systemd. I suspect that, as in other corners of the Linux world, there are lovers and haters of systemd.

Embedded systems are also quite often low on resources: CPU power, memory, space. If you are using hard space-constrained systems, the sheer size of systemd in the file system can be a valid reason not to use it at all. So it does depend on the type of embedded system you are looking at.
Re: [gentoo-user] Re: btrfs fails to balance
On 21/01/15 00:03, Rich Freeman wrote:

On Tue, Jan 20, 2015 at 10:07 AM, James wirel...@tampabay.rr.com wrote: Bill Kenworthy billk at iinet.net.au writes: You can turn off COW and go single on btrfs to speed it up but bugs in ceph and btrfs lose data real fast! Interesting idea, since I'll have raid1 underneath each node. I'll need to dig into this idea a bit more.

So, btrfs and ceph solve an overlapping set of problems in an overlapping set of ways. In general adding data security often comes at the cost of performance, and obviously adding it at multiple layers can come at the cost of additional performance. I think the right solution is going to depend on the circumstances. if ceph provided that protection against bitrot I'd probably avoid a COW filesystem entirely. It isn't going to add any additional value, and they do have a performance cost. If I had mirroring at the ceph level I'd probably just run them on ext4 on lvm with no mdadm/btrfs/whatever below that. Availability is already ensured by ceph - if you lose a drive then other nodes will pick up the load. If I didn't have robust mirroring at the ceph level then having mirroring of some kind at the individual node level would improve availability. On the other hand, ceph currently has some gaps, so having it on top of zfs/btrfs could provide protection against bitrot. However, right now there is no way to turn off COW while leaving checksumming enabled. It would be nice if you could leave the checksumming on. Then if there was bitrot btrfs would just return an error when you tried to read the file, and then ceph would handle it like any other disk error and use a mirrored copy on another node. The problem with ceph+ext4 is that if there is bitrot neither layer will detect it. Does btrfs+ceph really have a performance hit that is larger than btrfs without ceph? I fully expect it to be slower than ext4+ceph. Btrfs in general performs fairly poorly right now - that is expected to improve in the future, but I doubt that it will ever outperform ext4 other than for specific operations that benefit from it (like reflink copies). It will always be faster to just overwrite one block in the middle of a file than to write the block out to unallocated space and update all the metadata.

Answer to both you and James here:

I think it was pre 8.0 when I dropped out. It's Ceph that suffers from bitrot - I use the golden-master approach to generating the VMs, so corruption was obvious. I did report one bug in the early days that turned out to be btrfs, but I think it was largely ceph, which has been borne out by consolidating the ceph trial hardware and using it with btrfs and the same storage - rare problems, and I can point to hardware/power when it happened.

The performance hit was not due to lack of horsepower (CPU, RAM, etc.) but due to I/O - both network bandwidth and the internal bus on the hosts. That is why a small number of systems, no matter how powerful, won't work well. For real performance, I saw people using SSDs and large numbers of hosts in order to distribute the data flows - this does work and I saw some insane numbers posted. It also requires multiple networks (internal and external) to separate the flows (not VLAN but dedicated pipes) due to the extreme burstiness of the traffic.

As well as VM images, I had backups (using dirvish) and thousands of security camera images. Deletes of a directory with a lot of files would take many hours. Same with using ceph for a mail store (came up on the ceph list under "why is it so slow") - as a chunk server it's just not suitable for lots of small files. Towards the end of my use, I stopped seeing bitrot on a system holding data but sitting idle, limiting it to occurring during heavy use.

My overall conclusion is that lots of small hosts with no more than a couple of drives each, and multiple networks with lots of bandwidth, is what it's designed for. I had two reasons for looking at ceph - distributed storage where data in use was held close to the user but could be redistributed easily with multiple copies (think two small data stores with an intermittent WAN link storing high and low priority data), and high performance with high availability on HW failure. Ceph was not the answer for me at the scale I have.

BillK
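For reference, the separate internal/external pipes Bill describes map onto ceph.conf options along these lines (the subnets here are placeholders):

  [global]
      public network  = 192.168.10.0/24   # client-facing traffic
      cluster network = 10.10.10.0/24     # replication, recovery and backfill traffic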
Re: [gentoo-user] Re: Get off my lawn?
On Mon, 19 Jan 2015 18:03:44 + (UTC) James wrote:

Interestingly, Bircoph has solved many of the problems that seem to be in my path of discovery.

If you have any questions about particular issues, we may discuss them. From memory, for all setups we use nothing really special — standard Gentoo software, some custom scripts (for sync and/or HA) — and one really beautiful solution we wrote: clsync. In short, this is an lsyncd replacement in C which is much faster and has much more functionality (at least for our needs). Right now this software is not in tree, but it can be found in my dev overlay. A new clsync version was recently released and I plan to push it to the tree after some testing.

Best regards, Andrew Savchenko
[gentoo-user] Software to keep track of stocks
I've tried to set up some stocks in GnuCash but it does not list the TSX. What alternatives are there to keep track of stocks under Linux? -- Joseph
Re: [gentoo-user] Re: Get off my lawn?
On Tue, Jan 20, 2015 at 2:58 PM, Marc Stürmer m...@marc-stuermer.de wrote: Zitat von Tom H tomh0...@gmail.com: Lennart claims that the embedded world loves systemd. I suspect that, as in other corners of the Linux world, there are lovers and haters of systemd. Embedded systems also quite often means low on resources, CPU power, memory, space. If you are using hard space constrained systems, the sheer size of systemd in the file system can be a valid reason not to use it at all. So it does depend on the type of embedded system you are looking at. True. I've actually started comparing the direction systemd is moving in with busybox. The latter is of course already popular in embedded environments for the reasons you state. If you really want something super-minimal busybox is probably more of what you're looking for. On the other hand, if you want something more functional but still generally integrated then systemd might be the right solution. RAM use for systemd (plus its deps) seems to be on the order of maybe 2MB or so depending on what features you have resident (journal/etc). Most systemd utilities do not run continuously. Some of the shared memory use for systemd deps may be consumed already depending on what else is running on the system. Many systemd components would not necessarily need to be installed on-disk for an embedded system. For example, command-line utilities used by administrators to control their system might not need to be installed for systemd to still function (you don't need to manually change the runlevel of an embedded device, start/stop modules, etc - and all these tasks can be controlled over dbus without using the binaries on disk so your embedded application can still manage things). I'm not sure how systemd works with glibc alternatives, etc. If you can dispense with a shell entirely by moving to systemd then there could actually be some savings on that end, and performance will certainly be better. This page seems to be a fairly neutral/factual exploration of this issue: https://people.debian.org/~stapelberg/docs/systemd-dependencies.html This isn't really intended as a systemd is the right tool for every embedded solution or systemd is a horrible tool for embedded argument. It just is a tool and in the embedded world you should weigh its pros/cons as with anything else. Most likely an embedded environment is going to be highly-tailored in any case, so you'll be wanting to seriously consider your options for every component. If your embedded device is more like a phone with (relatively larger) gobs of RAM then systemd may be advantageous simply for its ubiquity. -- Rich
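On the point above about driving systemd over dbus without the command-line tools installed, a minimal sketch (the unit name is made up; busctl itself is optional, since any dbus client can make the same call):

  busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 \
      org.freedesktop.systemd1.Manager StartUnit ss "myapp.service" "replace"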
Re: [gentoo-user] SMART drive test results, 2.0 for same drive as before.
On 01/20/2015 09:58 AM, Dale wrote:

  1 Raw_Read_Error_Rate     0x000f   116   099   006    Pre-fail  Always       -       114620384
  5 Reallocated_Sector_Ct   0x0033   053   051   036    Pre-fail  Always       -       62752
197 Current_Pending_Sector  0x0012   088   088   000    Old_age   Always       -       1984
198 Offline_Uncorrectable   0x0010   088   088   000    Old_age   Offline      -       1984

Anyway, no clue how long this issue has been going on but there it is again. When I google, some places say this is somewhat normal. Some say it needs to be watched and some say the world is coming to a end and we are all going to die a horrible death. Me, I'm thinking this drive came out the south end of a north bound something bad, skunk maybe. Basically, it stinks and I'm not real happy about it. :-@

Based on those 4 I quoted from your original post I wouldn't use the drive anymore. It's indicating it can't correct some of the bad sectors and they're still visible to the OS (Current_Pending_Sector). This is very bad. It's also reallocated some sectors and they are not visible to the OS anymore. It's reallocated 62752 sectors.

Since this is the 2nd time for this specific drive, thoughts?

Recycle it.

By the way, I'm doing a dd to erase the drive just for giggles. Since it ain't blowing smoke, I may use it as a backup still, just to play with, until I can get another drive. I think that moved up the priority list a bit now.

Don't rely on this drive for backups.

Dan
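If the drive is going to be kept as a toy anyway, the dd wipe plus a recheck looks roughly like this - it destroys everything on /dev/sdd, and overwriting the pending sectors should either force them to be reallocated or fail outright:

  dd if=/dev/zero of=/dev/sdd bs=1M
  smartctl -A /dev/sdd     # recheck Current_Pending_Sector and Reallocated_Sector_Ct afterwards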