Re: Major HDD performance degradation on btrfs receive

2016-03-15 Thread Nazar Mokrynskyi

Sounds like a really good idea!

I'll try to implement it in my backup tool, but it might take some time 
to see real benefit from it (or no benefit :)).


Sincerely, Nazar Mokrynskyi
github.com/nazar-pc
Skype: nazar-pc
Diaspora: naza...@diaspora.mokrynskyi.com
Tox: 
A9D95C9AA5F7A3ED75D83D0292E22ACE84BA40E912185939414475AF28FD2B2A5C8EF5261249








Re: Major HDD performance degradation on btrfs receive

2016-03-15 Thread Chris Murphy
Maybe a starting point.
https://oss.oracle.com/~mason/seekwatcher/

This shows write patterns, not current state fragmentation. So it's
less useful for what you're asking, and more useful for what I was
suggesting as a batched snapshot delete strategy.


Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Major HDD performance degradation on btrfs receive

2016-03-15 Thread Chris Murphy
Very simplistically: visualizing Btrfs writes without file deletion,
it's a contiguous write. There isn't much scatter, even accounting for
metadata and data chunk writes happening in slightly different regions
of platter space. (I'm thinking this slowdown happens overwhelmingly
on HDDs.)

If there are file deletions, holes appear, and now some later writes
will fill those holes, but not exactly, which will lead to
fragmentation and thus seek times. Seeks would go up by a lot the
smaller the holes are. And the holes are smaller the fewer files are
being deleted at once.

If there's a snapshot, and then file deletions, holes don't appear.
Everything is always copy on write and deleted files don't actually
get deleted (they're still in another subvolume). So as soon as a file
is reflinked or in a snapshotted subvolume there's no fragmentation
happening with file deletions.

If there are many snapshots happening in a short time, such as once
every 10 minutes, that means only 10 minutes' worth of writes happening
in a given subvolume. If that space is later released by deleting
snapshots one at a time (like a rolling snapshot and delete strategy
every 10 minutes) that means only small holes are opening up for later
writes. It's maybe the worst case scenario for fragmenting Btrfs.

A better way might be to delay snapshot deletion. Keep taking the
snapshots, but delete old snapshots in batches. Delete maybe 10 or 100
(if we're talking thousands of snapshots) at once. This should free a
lot more contiguous space for later writes and significantly reduce
the chance of significant fragmentation. Of course some fragmentation
is going to happen no matter what, but I think the usage pattern
described in a lot of these slowdown cases sounds to me like a
worst-case scenario for CoW.

Now, a less lazy person would actually test this hypothesis.
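
For anyone who wants to try the batched-delete idea in a backup script, here is a minimal sketch of the selection policy only. It is an illustration under assumptions, not code from any existing tool: the function name, the `keep` and `batch` parameters, and the oldest-first ordering are all hypothetical, and the actual deletion (for example via `btrfs subvolume delete`) is left to the caller.

```python
from typing import List, Sequence


def snapshots_to_delete(snapshots: Sequence[str], keep: int, batch: int) -> List[str]:
    """Pick the oldest snapshots to delete now, or nothing if the batch is not full.

    `snapshots` must be ordered oldest first. Nothing is returned until at
    least `batch` snapshots exceed the retention limit `keep`; then exactly
    one full batch of the oldest snapshots is returned.
    """
    if keep < 0 or batch < 1:
        raise ValueError("keep must be >= 0 and batch must be >= 1")
    excess = len(snapshots) - keep
    if excess < batch:
        # Keep accumulating: the hypothesis above is that deleting one
        # snapshot at a time frees only small, scattered holes.
        return []
    return list(snapshots[:batch])
```

With `keep=1000` and `batch=100`, the snapshot count would drift between 1000 and 1099 instead of staying fixed at 1000. Whether that actually reduces fragmentation is exactly the untested hypothesis.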


Chris Murphy


Re: [PATCH] btrfs: btrfs_assert fix coding style

2016-03-15 Thread Anand Jain


 Ignore. NACK.

Thanks, Anand




bad metadata [125501440, 125517824) crossing stripe boundary

2016-03-15 Thread Nazar Mokrynskyi

I was running btrfsck today and got many errors like these:


bad metadata [125501440, 125517824) crossing stripe boundary
bad metadata [131334144, 131350528) crossing stripe boundary
bad metadata [142999552, 143015936) crossing stripe boundary
bad metadata [153944064, 153960448) crossing stripe boundary
bad metadata [281870336, 281886720) crossing stripe boundary
bad metadata [528285696, 528302080) crossing stripe boundary
bad metadata [661323776, 661340160) crossing stripe boundary
bad metadata [986316800, 986333184) crossing stripe boundary
bad metadata [987168768, 987185152) crossing stripe boundary
bad metadata [1029111808, 1029128192) crossing stripe boundary
bad metadata [1099169792, 1099186176) crossing stripe boundary

I was able to find a message with a similar error on the mailing list from 
January, but didn't find any answer.


I'm on kernel 4.5.0 stable and btrfs-tools 4.4. The filesystem was created 
at the beginning of July 2015.


Here is btrfs scrub output:


scrub status for 40b8240a-a0a2-4034-ae55-f8558c0343a8
scrub started at Wed Mar 16 04:13:54 2016 and finished after 00:52:51
total bytes scrubbed: 274.05GiB with 0 errors

It looks like no metadata errors were found, so what do those "bad metadata" 
messages really mean?
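
As a sanity check on the numbers themselves, the reported ranges can be tested against a stripe boundary with plain arithmetic. This is only a sketch: the 64 KiB stripe length is an assumption, the function name is made up, and it works on the addresses exactly as printed, which may not be the same comparison btrfsck performs internally. Under that assumption each of the three sample ranges is 16384 bytes long and starts on a multiple of 65536.

```python
STRIPE_LEN = 64 * 1024  # assumption: 64 KiB stripe length


def crosses_stripe(start: int, end: int, stripe_len: int = STRIPE_LEN) -> bool:
    """Return True if the half-open byte range [start, end) straddles a stripe boundary."""
    if end <= start:
        raise ValueError("end must be greater than start")
    # First and last byte of the range must fall into the same stripe.
    return start // stripe_len != (end - 1) // stripe_len


# Three of the ranges from the btrfsck output above.
reported = [
    (125501440, 125517824),
    (131334144, 131350528),
    (1099169792, 1099186176),
]

for start, end in reported:
    # Columns: start, length, offset within the stripe, crosses a boundary?
    print(start, end - start, start % STRIPE_LEN, crosses_stripe(start, end))
```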


--
Sincerely, Nazar Mokrynskyi
github.com/nazar-pc
Skype: nazar-pc
Diaspora: naza...@diaspora.mokrynskyi.com
Tox: 
A9D95C9AA5F7A3ED75D83D0292E22ACE84BA40E912185939414475AF28FD2B2A5C8EF5261249






Re: Major HDD performance degradation on btrfs receive

2016-03-15 Thread Nazar Mokrynskyi

> It could also be that the disk is bit older and has or is starting to
> use its spare sectors.

I do not really think the HDD is that old. I got it brand new less than a 
year ago. Here is the smartctl output:



nazar-pc@nazar-pc ~> sudo smartctl -a /dev/sda
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.5.0-haswell] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Samsung SpinPoint M9T
Device Model:     ST2000LM003 HN-M201RAD
Serial Number:    S34RJ9CF727799
LU WWN Device Id: 5 0004cf 20dbc7ec5
Firmware Version: 2BC10004
User Capacity:    2 000 398 934 016 bytes [2,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Mar 16 01:25:17 2016 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (23760) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 396) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always   -           16
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always   -           0
  3 Spin_Up_Time            0x0023   088   086   025    Pre-fail  Always   -           3760
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always   -           840
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always   -           0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline  -           0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always   -           6208
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           678
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always   -           11
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always   -           0
194 Temperature_Celsius     0x0002   053   044   000    Old_age   Always   -           47 (Min/Max 17/56)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always   -           0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always   -           20
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always   -           7
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always   -           7035


SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log 

Why we always balance system chunk alone with metadata?

2016-03-15 Thread Qu Wenruo

Hi,

While debugging a bug related to balancing metadata chunks, we found 
that if we specify the -m option for "btrfs balance", it always balances 
system chunks too.


cmds-balance.c:
---
	/*
	 * allow -s only under --force, otherwise do with system chunks
	 * the same thing we were ordered to do with meta chunks
	 */
	if (args.flags & BTRFS_BALANCE_SYSTEM) {
		if (!force) {
			error(
			    "Refusing to explicitly operate on system chunks.\n"
			    "Pass --force if you really want to do that.");
			return 1;
		}
	} else if (args.flags & BTRFS_BALANCE_METADATA) {
		args.flags |= BTRFS_BALANCE_SYSTEM;	<<< Here
		memcpy(&args.sys, &args.meta,
			sizeof(struct btrfs_balance_args));
	}
---

I'm curious why we always bind system chunks to the metadata balance.

Is there any special reason?
The patch introducing this behavior dates back to 2012, and it 
makes us unable to do a metadata *only* balance.
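
To make the behavior easy to see outside of btrfs-progs, here is a small model of just the flag handling in the snippet quoted above. It is a sketch, not the real code: the function name is made up, the bit values are assumed to mirror the BTRFS_BALANCE_* constants, and the copying of the metadata filter arguments to the system filter is left out.

```python
BALANCE_DATA = 1 << 0      # assumed to mirror BTRFS_BALANCE_DATA
BALANCE_SYSTEM = 1 << 1    # assumed to mirror BTRFS_BALANCE_SYSTEM
BALANCE_METADATA = 1 << 2  # assumed to mirror BTRFS_BALANCE_METADATA


def effective_balance_flags(flags: int, force: bool) -> int:
    """Model which chunk types end up balanced for the requested flags."""
    if flags & BALANCE_SYSTEM:
        # An explicit -s is only accepted together with --force.
        if not force:
            raise ValueError("refusing to operate on system chunks without --force")
        return flags
    if flags & BALANCE_METADATA:
        # -m silently pulls in system chunks as well.
        return flags | BALANCE_SYSTEM
    return flags
```

In this model there is no input for which the result contains BALANCE_METADATA without BALANCE_SYSTEM, which is the "no metadata-only balance" limitation described above.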


Any idea?

Thanks,
Qu




[PATCH 2/2 v2] btrfs-progs: Fix a regression that "property" with -t option doesn't work

2016-03-15 Thread Satoru Takeuchi
Sorry, I forgot to add "btrfs-progs: " in the subject line.
---
Since the following commit, "property" has been treated as taking
no options.

commit 176aeca9a148 ("btrfs-progs: add getopt stubs where needed")

However, the -t option can be passed to this command.

* actual result

  ==
  $ ./btrfs prop list -t f /btrfs
  btrfs property list: invalid option -- 't'
  usage: btrfs property list [-t <type>] <object>

  Lists available properties with their descriptions for the given object.

  Please see the help of 'btrfs property get' for a description of
  objects and object types.

  ==

* expected result

  ==
  $ ./btrfs prop list -t f /btrfs
  label   Set/get label of device.
  ==

Signed-off-by: Satoru Takeuchi 

[PATCH 1/2] btrfs-progs: Describe optarg of -m option in the manpage of receive

2016-03-15 Thread Satoru Takeuchi
Signed-off-by: Satoru Takeuchi 
---
This patch can be applied to devel branch (commit: 4685a560811a)
---
 Documentation/btrfs-receive.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/btrfs-receive.asciidoc b/Documentation/btrfs-receive.asciidoc
index 84b85c1..758eebe 100644
--- a/Documentation/btrfs-receive.asciidoc
+++ b/Documentation/btrfs-receive.asciidoc
@@ -43,7 +43,7 @@ or on EOF.
 --max-errors ::
 Terminate as soon as N errors happened while processing commands from the send
 stream. Default value is 1. A value of 0 means no limit.
--m::
+-m ::
 The root mount point of the destination fs.
 +
 By default the mountpoint is searched in /proc/self/mounts.
-- 
2.5.0


[PATCH] btrfs: btrfs_assert fix coding style

2016-03-15 Thread Anand Jain
Signed-off-by: Anand Jain 
---
 fs/btrfs/ctree.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index c9fb7b9ca8a4..5b5002a242e4 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4283,8 +4283,8 @@ static inline void assfail(char *expr, char *file, int line)
 #define ASSERT(expr)   ((void)0)
 #endif
 
-#define btrfs_assert()
-__printf(5, 6)
+#define btrfs_assert() __printf(5, 6)
+
 __cold
 void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function,
 unsigned int line, int errno, const char *fmt, ...);
-- 
2.7.0



Re: Incompat features: raid56 ... when creating a RAID6?

2016-03-15 Thread Anand Jain




> One more question: Any idea what happened here? Did I send this garbage? This
> was at the very end of your response mail...

 Yes, it looks like it. It was there when I read your posting.


AndiN‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±nÚß²)í…æèw*jg¬±¨¶‰šŽŠÝ¢j/êäz
¹Þ–Šà2ŠÞ™¨è­Ú&¢)ß¡«a¶Úþø®G«éh®æj:+v‰¨Šwè†Ù¥




[PATCH v2] btrfs: block incompatible optional features at scan

2016-03-15 Thread Anand Jain
For the matter of completeness we need to check if the device
being scanned has features that are known to the kernel. As of
now, if it doesn't, the mount will fail, so what is the point
of having those devices added to the btrfs_fs_devices list at
device_list_add()?

So block those devices at scan. This means the original check in
open_ctree() won't be reached for a device with an unsupported feature.
But I am leaving that code as it is, without deleting it.
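
The check itself is a plain bit-mask operation. A minimal sketch of it follows, for illustration only: the function name is made up, the bit values are the ones visible in the ctree.h hunk of this patch, and SUPPORTED is an illustrative subset, not the kernel's full BTRFS_FEATURE_INCOMPAT_SUPP.

```python
U64_MASK = (1 << 64) - 1  # keep results in the range of an unsigned 64-bit value

INCOMPAT_RAID56 = 1 << 7
INCOMPAT_SKINNY_METADATA = 1 << 8
INCOMPAT_NO_HOLES = 1 << 9
INCOMPAT_TEST = 1 << 10  # test-only bit, unknown to the "kernel" side

# Illustrative subset only; the real supported mask contains more bits.
SUPPORTED = INCOMPAT_RAID56 | INCOMPAT_SKINNY_METADATA | INCOMPAT_NO_HOLES


def unsupported_incompat(flags: int, supported: int = SUPPORTED) -> int:
    """Return the incompat bits set in `flags` that are not in `supported`."""
    return flags & ~supported & U64_MASK
```

A non-zero result corresponds to the case where the scan should be rejected; zero means every incompat bit on the device is known.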

Unit testing:
Create progs with the following changes and mkfs.
--
diff --git a/ctree.h b/ctree.h
index 5ab0f4a45a15..a8d86facc045 100644
--- a/ctree.h
+++ b/ctree.h
@@ -480,6 +480,7 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_RAID56		(1ULL << 7)
 #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA	(1ULL << 8)
 #define BTRFS_FEATURE_INCOMPAT_NO_HOLES		(1ULL << 9)
+#define BTRFS_FEATURE_INCOMPAT_TEST		(1ULL << 10)

 #define BTRFS_FEATURE_COMPAT_SUPP		0ULL

@@ -495,7 +496,8 @@ struct btrfs_super_block {
 	 BTRFS_FEATURE_INCOMPAT_RAID56 |		\
 	 BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS |		\
 	 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA |	\
-	 BTRFS_FEATURE_INCOMPAT_NO_HOLES)
+	 BTRFS_FEATURE_INCOMPAT_NO_HOLES |		\
+	 BTRFS_FEATURE_INCOMPAT_TEST)

 /*
  * A leaf is full of items. offset and size tell us where to find
diff --git a/mkfs.c b/mkfs.c
index ea584042db16..f3665a93364b 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1368,7 +1368,7 @@ int main(int ac, char **av)
 	int dev_cnt = 0;
 	int saved_optind;
 	char fs_uuid[BTRFS_UUID_UNPARSED_SIZE] = { 0 };
-	u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
+	u64 features = BTRFS_MKFS_DEFAULT_FEATURES | BTRFS_FEATURE_INCOMPAT_TEST;
 	struct mkfs_allocation allocation = { 0 };
 	struct btrfs_mkfs_config mkfs_cfg;


Results..

btrfs dev scan /dev/sdc
Scanning for Btrfs filesystems in '/dev/sdc'
ERROR: device scan failed '/dev/sdc' - Protocol family not supported

btrfs dev ready /dev/sdc
ERROR: unable to determine if device '/dev/sdc' is ready for mount: Protocol family not supported

mount /dev/sdc /btrfs
mount: mount /dev/sdc on /btrfs failed: Protocol family not supported

Signed-off-by: Anand Jain 
---

v2: Commit update with unit test case and its results.
Remove the printk of the error to the system log; that's not required.
Return -EPFNOSUPPORT instead of -ENOSUPPORT; that's more appropriate
when a device with an incompatible feature is found during device scan.

 fs/btrfs/volumes.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6f7873e401ac..8ca3b0d3f1ef 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1071,6 +1071,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
 	u64 transid;
 	u64 total_devices;
 	u64 bytenr;
+	u64 features;
 
 	/*
 	 * we would like to check all the supers, but that would make
@@ -1091,6 +1092,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
 	if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super))
 		goto error_bdev_put;
 
+	features = btrfs_super_incompat_flags(disk_super) &
+			~BTRFS_FEATURE_INCOMPAT_SUPP;
+	if (features) {
+		ret = -EPFNOSUPPORT;
+		goto error_disk_super;
+	}
 	devid = btrfs_stack_device_id(&disk_super->dev_item);
 	transid = btrfs_super_generation(disk_super);
 	total_devices = btrfs_super_num_devices(disk_super);
@@ -1105,6 +1112,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
 	if (!ret && fs_devices_ret)
 		(*fs_devices_ret)->total_devices = total_devices;
 
+error_disk_super:
 	btrfs_release_disk_super(page);
 
 error_bdev_put:
-- 
2.7.0



Re: [Hit return to continue]

2016-03-15 Thread Anand Jain


 Uh! Where did that title come from? Sorry about the title.
 I am resending.

-Anand



Re: [PATCH 2/4] btrfs-progs: fix a reression that "property" with -t option doesn't work

2016-03-15 Thread Satoru Takeuchi

On 2016/03/14 21:23, David Sterba wrote:
> On Mon, Mar 14, 2016 at 09:12:36AM +0900, Satoru Takeuchi wrote:
>> --- a/cmds-property.c
>> +++ b/cmds-property.c
>> @@ -379,9 +379,7 @@ static int cmd_property_get(int argc, char **argv)
>> char *name = NULL;
>> int types = 0;
>>
>> -   clean_args_no_options(argc, argv, cmd_property_get_usage);
>> -
>> -   if (check_argc_min(argc, 2) || check_argc_max(argc, 5))
>> +   if (check_argc_min(argc, 2))
>> usage(cmd_property_get_usage);
>>
>> parse_args(argc, argv, cmd_property_get_usage, , , ,
>
> We still need to check the number of non-option arguments here, when the
> optind is set from parse_args.

OK, I'll send a patch which checks the number after getopt.

Thanks,
Satoru

>> @@ -415,9 +413,7 @@ static int cmd_property_set(int argc, char **argv)
>> -   if (check_argc_min(argc, 4) || check_argc_max(argc, 6))
>> +   if (check_argc_min(argc, 4))
>> usage(cmd_property_set_usage);
>
> ...
>
>> parse_args(argc, argv, cmd_property_set_usage, ,
>> @@ -446,9 +442,7 @@ static int cmd_property_list(int argc, char **argv)
>> -   if (check_argc_min(argc, 2) || check_argc_max(argc, 4))
>> +   if (check_argc_min(argc, 2))
>
> ...
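
The point about counting arguments only after option parsing can be illustrated with a small standalone sketch. This is not the btrfs-progs code: it uses Python's standard getopt module, the function name is hypothetical, and it assumes that "list" takes an optional -t value followed by exactly one object.

```python
import getopt
from typing import List, Optional, Tuple


def parse_property_list(argv: List[str]) -> Tuple[Optional[str], str]:
    """Parse `[-t <type>] <object>`; argv excludes the command name itself."""
    try:
        opts, rest = getopt.getopt(argv, "t:")
    except getopt.GetoptError as exc:
        raise ValueError(str(exc))
    # Count the non-option arguments only now, after the options have been
    # consumed, so that "-t f" does not inflate the count.
    if len(rest) != 1:
        raise ValueError("expected exactly one object, got %d" % len(rest))
    obj_type = None
    for flag, value in opts:
        if flag == "-t":
            obj_type = value
    return obj_type, rest[0]
```

Checking the raw argc before parsing, as the removed check_argc_max() calls did, has to guess how many slots the options occupy; checking `rest` afterwards does not.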




Re: [PATCH v7 01/20] btrfs: dedup: Introduce dedup framework and its header

2016-03-15 Thread Duncan
Nicholas D Steeves posted on Tue, 15 Mar 2016 18:08:41 -0400 as excerpted:

> I'm not sure to what degree the following is a relevant concern, and I'm
> guessing it's not, other than for laughs, but to me "dedupe" reads as
> "de-dupe" or "undupe".  While it functions as the inverse of the verb
> "to dupe", I don't think one can "be unduped" or "be unfooled". What is
> that old aphorism?  "Once duped twice shy"? ;-)

That's the obvious association, yes, and the negative connotations of 
dupe are surely why I have such a personal negative reaction to dedupe.  
But precedent and current usage being what they are...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: Major HDD performance degradation on btrfs receive

2016-03-15 Thread Henk Slager
On Tue, Mar 15, 2016 at 1:47 AM, Nazar Mokrynskyi  wrote:
> Some update since last time (few weeks ago).
>
> All filesystems are mounted with noatime, I've also added mounting
> optimization - so there is no problem with remounting filesystem every time,
> it is done only once.
>
> Remounting optimization helped by reducing 1 complete snapshot +
> send/receive cycle by some seconds, but otherwise it is still very slow when
> `btrfs receive` is active.

OK, great that the umount+mount is gone. I think most time is
unfortunately spent in seeks; I think over time and due to various
factors, both free space and files have become highly fragmented on
your disk. It could also be that the disk is a bit older and has
started, or is starting, to use its spare sectors.

> I'm not considering bcache + btrfs as potential setup because I do not
> currently have free SSD for it and basically spending SSD besides HDD for
> backup partition feels like a bit of overkill (especially for desktop use).

Yes, I think so too; for backup, I am also a bit reluctant to use
bcache. But the big difference is that you do a snapshot transfer
every 15 minutes while I do that only every 24 hours. So I almost
don't care how long the send|receive takes in the middle of the night.
I also almost never look at the backups, and when I do, scanning
through a fs with 1000 snapshots on a spinning disk does indeed take
time. If a script does that every 15 minutes, the fs uses LZO
compression and there is another active partition, then you will have
to deal with the slowness. And if the files are mostly small, like
source trees, it gets even worse. So it is about 100x more
creates+deletes of subvolumes. To be honest, I think it is just
requiring too much from a HDD, knowing that btrfs is CoW. On a fresh
fs it might work OK in the beginning, but over time...

You could adapt the script or backup method not to search every time,
but to just write the next diff with send|receive and only step back
and search if this fails.

Or keep more 15-minute snapshots only on the SSD and lower the rate at
which you send|receive them to the HDD.

Another thing you could do is skip the receive step: just pipe the
15-minute snapshot diff to a stream file and leave it on the backup
HDD until you need files from the backup. Only then do a series of
incremental receives of the streams. And every now and then do a full
(non-incremental) send.
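
The stream-file idea could look roughly like the sketch below. It only
prints the commands instead of running them, and the snapshot paths and
the stream directory are made-up examples, not anything from your setup.
The only real pieces are `btrfs send -p <parent> -f <file> <snapshot>`
and `btrfs receive -f <file> <dir>`.

```shell
#!/bin/sh
# Sketch only: print (do not run) the commands for storing incremental
# send streams as files and replaying them later.
# All paths are hypothetical examples.

# send_to_stream PARENT SNAPSHOT STREAMDIR
# An empty PARENT means a full (non-incremental) send.
send_to_stream() {
    parent=$1; snap=$2; dir=$3
    out="$dir/$(basename "$snap").stream"
    if [ -n "$parent" ]; then
        echo "btrfs send -p $parent -f $out $snap"
    else
        echo "btrfs send -f $out $snap"
    fi
}

# replay_streams TARGET STREAM...
# Streams have to be replayed oldest first, starting with the full one.
replay_streams() {
    target=$1; shift
    for s in "$@"; do
        echo "btrfs receive -f $s $target"
    done
}

send_to_stream "" /snap/home.0 /backup/streams
send_to_stream /snap/home.0 /snap/home.1 /backup/streams
replay_streams /backup/restore \
    /backup/streams/home.0.stream /backup/streams/home.1.stream
```

Writing a stream file is a plain sequential write, which is about the
friendliest workload a HDD can get; the seek-heavy receive is deferred
until a restore is actually needed.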

> My current kernel is 4.5.0 stable, btrfs-tools still 4.4-1 from Ubuntu 16.04
> repository as of today.
>
> As I'm reading mailing list there are other folks having similar performance
> issues. So can we debug things to find the root cause and fix it at some
> point?

Indeed there are multiple reports with similar symptoms. I don't think
one should see it as an error, a root cause or some fault. It is a
matter of further optimization, specifically for hard disks, or of
implementing additional concepts just for hard disks. For (parity)
RAID (by btrfs itself, not DM or MD etc.), one can exploit
parallelism, but it is not trivial to get that fully optimized for all
device configurations and tasks.

> My C/C++/Kernel/BTRFS knowledges are scarce, which is why some assistance
> here is needed from someone more experienced.

It is all about HDD seek times in the first place. There are many
thoughts, articles and benchmarks about this over the years on the
internet, but I just found this one from last year about XFS:
https://lkml.org/lkml/2015/4/29/776


Re: Snapshots slowing system

2016-03-15 Thread Peter Chant
On 03/15/2016 03:52 PM, Duncan wrote:



> Meanwhile, FWIW, some months ago I finally got tired of having to specify 
> noatime on all my mounts, expanding my fstab width by 8 chars (including 
> the ,) and the total fstab character count by several multiples of that 
> as I added it to all entries, and decided to see if I might per chance, 
> even as a sysadmin not a dev, be able to come up with a patch that 
> changed the kernel default to noatime.  It wasn't actually hard, tho were 
> I a coder and actually knew what I was doing, I imagine I could create a 
> much better patch.  So now all my filesystems (barring a few of the 
> memory-only virtual-filesystem mounts) are mounted noatime by default, as 
> opposed to the unpatched relatime, and I was able to take all the noatimes 
> out of my fstab. =:^)

It is a pity you cannot use variables or macros in fstab.  It's not too
bad with traditional file systems on my home user machine, but with
multiple subvolumes my fstab is huge and there is a lot of repetition
of the options.
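
fstab itself has no variables, but the repetition can be pushed into a
small generator. A minimal sketch, assuming a placeholder filesystem
UUID, a made-up list of subvolume names and one shared option string;
the output would want reviewing before it goes anywhere near /etc/fstab:

```shell
#!/bin/sh
# Sketch: generate repetitive btrfs fstab lines from one option string.
# The UUID, subvolume names and mount points are hypothetical.

OPTS="noatime,autodefrag,compress=lzo"

# fstab_line UUID SUBVOL MOUNTPOINT
fstab_line() {
    printf 'UUID=%s %s btrfs %s,subvol=%s 0 0\n' "$1" "$3" "$OPTS" "$2"
}

uuid="00000000-0000-0000-0000-000000000000"
fstab_line "$uuid" root   /
fstab_line "$uuid" home   /home
fstab_line "$uuid" photos /srv/photos
```

Changing an option for every subvolume then means editing OPTS once and
regenerating the lines.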



>> Hmm.  A bit tight.  I've just ordered a replacement SSD.
> 
> While ~8 GiB unallocated on a ~118 GiB filesystem is indeed a bit tight, 
> it's nothing that should be giving btrfs fits yet.
> 

Too late, drive ordered.  It was only a matter of time anyway.


> Tho even with autodefrag, given the previous relatime and snapshotting, 
> it could be that the free-space in existing chunks is fragmented, which 
> over time and continued usage would force higher file fragmentation 
> despite the autodefrag, since there simply aren't any large contiguous 
> free-space areas left in which to write files.
>

Hmm. The following returns instantly as if it were a null operation.
btrfs fi defrag /

I thought though that btrfs fi defrag  would only defrag the one
file or directory?

btrfs fi defrag /srv/photos/
Is considerably slower, it is still running.  Disk light is on solid.
Processes kworker and btrfs-transacti are pretty busy according to iotop.




> But either way, given the LZO compression it appears I've used under half 
> the 8 GiB capacity.  Meanwhile, du -xBM / says 4158M, so just over half 
> in uncompressed data  (with --apparent-size added it says 3624M).
> 

I seem to install a lot of interesting-looking things I barely use.  I
am surprised at how full the filesystem gets; it should not.
However, large disks make life much easier than rooting out
unused packages as a hobby.  Unless it gets silly.



> 
> Boot is an exception to the usual btrfs raid1, with a separate working 
> boot partition on one device and its backup on the other, so I can point 
> the BIOS at and boot either one.  It's btrfs mixed-bg mode dup, 256 MiB 
> for each of working and backup, which because it's dup means 128 MiB 
> capacity.  That's actually a bit small, and why I'll be shrinking the log 
> partition the next time I repartition.  Making it 384 MiB dup, for 192 
> MiB capacity, would be much better, and since I can shrink the log 
> partition by that and still keep the main partitions GiB aligned, it all 
> works out.
> 

Slackware uses lilo so I need a separate /boot with something that is
supported by lilo.



> If I had 500 GiB SSDs like the one you're getting, I could put the media 
> partition on SSDs and be rid of the spinning rust entirely.  But I seem 
> to keep finding higher priorities for the money I'd spend on a pair of 
> them...


I'm getting one, not two, so the system is raid0.  Data is more
important (and backed up).

> 


> Good point.  Similar here except the backup/maintenance isn't a cutdown 
> system, it's a snapshot (in time, not btrfs snapshot) of exactly what was 
> on the system when I did the backup.  That way, should it be necessary, I 
> can boot the backup and have a fully functional system exactly as it was 
> the day I took that backup.  That's very nice to have for a maintenance 
> setup, since it means I have access to full manpages, even a full X, 
> media players, a full graphical browser to google my problems with, etc.
> 
I have that as well.  But the non-btrfs maintenance partition is there
in case btrfs is unbootable.

-- 
Peter Chant


Re: Incompat features: raid56 ... when creating a RAID6?

2016-03-15 Thread Anand Jain



On 03/16/2016 05:42 AM, Andreas Grosse wrote:

Hello everyone!

I just wanted to create a RAID6 and got the following output:


# mkfs.btrfs -d raid6 -m raid6 -L slowPool /dev/sd[cdefgh]
btrfs-progs v4.4.1
See http://btrfs.wiki.kernel.org for more information.

Label:              slowPool
UUID:               85ddf6a7-51f1-4cc0-b6cd-d472277b0e86
Node size:          16384
Sector size:        4096
Filesystem size:    27.29TiB

Block group profiles:
  Data:             RAID6             4.01GiB
  Metadata:         RAID6             4.01GiB
  System:           RAID6            14.50MiB

SSD detected:       no
Incompat features:  extref, raid56, skinny-metadata
Number of devices:  6

Devices:
   ID        SIZE  PATH

    1     4.55TiB  /dev/sdc
    2     4.55TiB  /dev/sdd
    3     4.55TiB  /dev/sde
    4     4.55TiB  /dev/sdf
    5     4.55TiB  /dev/sdg
    6     4.55TiB  /dev/sdh


And then the line saying "Incompat features: ... raid56" came to my eyes.
Reading the corresponding manpage, it says:

raid56
 extended format for RAID5/6, also enabled if raid5 or raid6 block groups
 are selected


So why is raid56 marked as incompatible if I just created a file system with
multiple disks using the RAID6 profile? Have I misunderstood something there?
I am confused. Can somebody here lighten this up?


 Those messages are indeed confusing. It just indicates that the
 FS may fail to mount on certain older kernels, not necessarily
 the kernel in the system on which you ran btrfs-progs. Sorry
 that it is not very obvious at the moment, but there are patches
 to make this part better.

 To check the features that your running kernel supports you could use
  ls /sys/fs/btrfs/features

 But note: some of the names used under /sys/fs/btrfs/features don't
 exactly match the names btrfs-progs prints as "Incompat features"
 (a bug, which is also fixed in the patch).
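
 The sysfs check can also be scripted. A minimal sketch, assuming each
 supported feature shows up as an entry under /sys/fs/btrfs/features;
 the spellings skinny_metadata and extended_iref are what I believe the
 kernel uses for what btrfs-progs calls skinny-metadata and extref, so
 compare them against your own listing:

```shell
#!/bin/sh
# Sketch: check whether the running kernel advertises a btrfs feature.
# The optional second argument only exists so the function can be
# pointed at another directory (e.g. for testing); it defaults to the
# sysfs path mentioned above.

kernel_has_feature() {
    feature=$1
    dir=${2:-/sys/fs/btrfs/features}
    [ -e "$dir/$feature" ]
}

for f in raid56 skinny_metadata extended_iref; do
    if kernel_has_feature "$f"; then
        echo "$f: supported by this kernel"
    else
        echo "$f: not listed (or btrfs module not loaded)"
    fi
done
```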

Thanks, Anand



(If important: Gentoo Linux with kernel 4.5.0, btrfs-progs v4.4.1)

Andi




Re: [PATCH v7 01/20] btrfs: dedup: Introduce dedup framework and its header

2016-03-15 Thread Nicholas D Steeves
On 13 March 2016 at 12:55, Duncan <1i5t5.dun...@cox.net> wrote:
> NeilBrown posted on Sun, 13 Mar 2016 22:33:22 +1100 as excerpted:
>
>> On Sun, Mar 13 2016, Qu Wenruo wrote:
>>
>>> BTW, I am always interested in, why de-duplication can be shorted as
>>> 'dedupe'.
>
>>> I didn't see any 'e' in the whole word "DUPlication".
>>> Or it's an abbreviation of "DUPlicatE" instead of "DUPlication"?
>>
>> The "u" in "duplicate" is pronounced as a long vowel sound, almost like
>> d-you-plicate.
>
>> To make a vowel long you can add an 'e' at the end of a word.
>
>> by analogy, "dupe" has a long "u" and so sounds like the first syllable
>> of "duplicate".
>
> As a native (USian but with some years growing up in the then recently
> independent former Crown colony of Kenya, influencing my personal
> preferences) English speaker, while what Neil says about short "u" vs.
> long "u" is correct, I agree with Qu that the "e" in dupe doesn't make so
> much sense, and would, other things being equal, vastly prefer dedup to
> dedupe, myself.
>
> However, there's some value in consistency, and given the previous dedupe
> precedent in-kernel, sticking to that for consistency reasons makes sense.
>
> But were this debate to have been about the original usage, I'd have
> definitely favored dedup all the way, as not withstanding Neil's argument
> above, adding the "e" makes little sense to me either.  So only because
> it's already in use in kernel code, but if this /were/ the original
> kernel code...
>
> So I definitely understand your confusion, Qu, and have the same personal
> preference even as a native English speaker. =:^)

I'm not sure to what degree the following is a relevant concern, and
I'm guessing it's not, other than for laughs, but to me "dedupe" reads
as "de-dupe" or "undupe".  While it functions as the inverse of the
verb "to dupe", I don't think one can "be unduped" or "be unfooled".
What is that old aphorism?  "Once duped twice shy"? ;-)

Honestly I'm surprised that a verb-form of "tuple" hasn't yet emerged,
because if it had we might be saying "detup" instead of "dedup".

Best regards,
Nicholas


Re: Incompat features: raid56 ... when creating a RAID6?

2016-03-15 Thread Hugo Mills
On Tue, Mar 15, 2016 at 10:42:29PM +0100, Andreas Grosse wrote:
> Hello everyone!
> 
> I just wanted to create a RAID6 and got the following output:
> 
> > # mkfs.btrfs -d raid6 -m raid6 -L slowPool /dev/sd[cdefgh]
[snip]
> > Incompat features:  extref, raid56, skinny-metadata
[snip]
> And then the line saying "Incompat features: ... raid56" came to my eyes. 
> Reading the corresponding manpage, it says:
> > raid56
> > extended format for RAID5/6, also enabled if raid5 or raid6 block groups
> > are selected
> So why is raid56 marked as incompatible if I just created a file system with 
> multiple disks using the RAID6 profile? Have I misunderstood something there? 
> I am confused. Can somebody here lighten this up?

   It's a safety thing.

   The incompat flags are markers set in the filesystem to indicate
which features that particular FS uses. Each kernel version has a list
of features it can handle, and if it's asked to mount a filesystem
with a feature that it doesn't recognise, it'll refuse to do so.

   So, you've created a filesystem with the RAID5/6 feature, it's
marked as such in the FS (with the incompat flag "raid56"), and
attempting to mount that FS on a kernel that doesn't know about parity
RAID (earlier than 3.14, IIRC) will fail safely because the kernel
can't handle it.
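
   To make the mechanism concrete, here is a rough sketch of the check
in shell arithmetic. The bit positions are the kernel's
BTRFS_FEATURE_INCOMPAT_* values as I remember them (bit 4 was called
COMPRESS_LZOv2 at the time), so treat the table as an assumption and
verify it against your kernel headers. 0x161 is just an example value
for a filesystem without parity RAID.

```shell
#!/bin/sh
# Sketch: decode a btrfs incompat_flags bitmask and mimic the
# "refuse to mount if any unknown bit is set" check.
# The name table (bit 0 first) is an assumption; verify against the
# BTRFS_FEATURE_INCOMPAT_* defines of your kernel.

NAMES="MIXED_BACKREF DEFAULT_SUBVOL MIXED_GROUPS COMPRESS_LZO COMPRESS_LZOv2 BIG_METADATA EXTENDED_IREF RAID56 SKINNY_METADATA NO_HOLES"

# decode_incompat FLAGS -> prints the names of all set bits on one line
decode_incompat() {
    flags=$(( $1 ))
    bit=0
    out=""
    for name in $NAMES; do
        if [ $(( (flags >> bit) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$name"
        fi
        bit=$(( bit + 1 ))
    done
    echo "$out"
}

# can_mount FS_FLAGS SUPPORTED_MASK
# Succeeds only if the filesystem sets no bit outside the supported mask.
can_mount() {
    [ $(( $1 & ~($2) )) -eq 0 ]
}

decode_incompat 0x161   # prints: MIXED_BACKREF BIG_METADATA EXTENDED_IREF SKINNY_METADATA

# A filesystem with RAID56 (0x80) set, on a kernel that does not know that bit:
if can_mount 0x1e1 0x161; then
    echo "mount allowed"
else
    echo "mount refused: unsupported incompat bits"
fi
```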

   Hugo.

-- 
Hugo Mills | Alert status upwards vermilion: High probability of
hugo@... carfax.org.uk | flash photography. Avoid wearing brogues.
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Incompat features: raid56 ... when creating a RAID6?

2016-03-15 Thread Andreas Grosse
Hello everyone!

I just wanted to create a RAID6 and got the following output:

> # mkfs.btrfs -d raid6 -m raid6 -L slowPool /dev/sd[cdefgh]
> btrfs-progs v4.4.1
> See http://btrfs.wiki.kernel.org for more information.
> 
> Label:              slowPool
> UUID:               85ddf6a7-51f1-4cc0-b6cd-d472277b0e86
> Node size:          16384
> Sector size:        4096
> Filesystem size:    27.29TiB
> 
> Block group profiles:
>   Data:             RAID6             4.01GiB
>   Metadata:         RAID6             4.01GiB
>   System:           RAID6            14.50MiB
> 
> SSD detected:       no
> Incompat features:  extref, raid56, skinny-metadata
> Number of devices:  6
> 
> Devices:
>    ID        SIZE  PATH
>
>     1     4.55TiB  /dev/sdc
>     2     4.55TiB  /dev/sdd
>     3     4.55TiB  /dev/sde
>     4     4.55TiB  /dev/sdf
>     5     4.55TiB  /dev/sdg
>     6     4.55TiB  /dev/sdh

And then the line saying "Incompat features: ... raid56" came to my eyes. 
Reading the corresponding manpage, it says:
> raid56
> extended format for RAID5/6, also enabled if raid5 or raid6 block groups
> are selected

So why is raid56 marked as incompatible if I just created a file system with 
multiple disks using the RAID6 profile? Have I misunderstood something there? 
I am confused. Can somebody here lighten this up?

(If important: Gentoo Linux with kernel 4.5.0, btrfs-progs v4.4.1)

Andi

[PATCH v2 12/12] block: test fallocate for block devices

2016-03-15 Thread Darrick J. Wong
Now that we're wiring up fallocate's PUNCH_HOLE and ZERO_RANGE
features for block devices, add some tests to make sure they
work correctly.

v2: Update tests to reflect EOD clamping suggested by Linus.

Signed-off-by: Darrick J. Wong 
---
 common/scsi_debug |6 ++
 tests/generic/705 |   77 
 tests/generic/705.out |   11 +
 tests/generic/706 |   74 +++
 tests/generic/706.out |   10 
 tests/generic/707 |  118 +
 tests/generic/707.out |   32 +
 tests/generic/group   |3 +
 8 files changed, 330 insertions(+), 1 deletion(-)
 create mode 100755 tests/generic/705
 create mode 100644 tests/generic/705.out
 create mode 100755 tests/generic/706
 create mode 100644 tests/generic/706.out
 create mode 100755 tests/generic/707
 create mode 100644 tests/generic/707.out

diff --git a/common/scsi_debug b/common/scsi_debug
index eb08126..74c3802 100644
--- a/common/scsi_debug
+++ b/common/scsi_debug
@@ -40,13 +40,17 @@ _get_scsi_debug_dev()
logical=${2-512}
unaligned=${3-0}
size=${4-128}
+   test -n "$4" && shift
+   test -n "$3" && shift
+   test -n "$2" && shift
+   test -n "$1" && shift
 
phys_exp=0
while [ $logical -lt $physical ]; do
let physical=physical/2
let phys_exp=phys_exp+1
done
-   opts="sector_size=$logical physblk_exp=$phys_exp lowest_aligned=$unaligned dev_size_mb=$size"
+   opts="sector_size=$logical physblk_exp=$phys_exp lowest_aligned=$unaligned dev_size_mb=$size $@"
echo "scsi_debug options $opts" >> $seqres.full
modprobe scsi_debug $opts
[ $? -eq 0 ] || _fail "scsi_debug modprobe failed"
diff --git a/tests/generic/705 b/tests/generic/705
new file mode 100755
index 000..4bb8752
--- /dev/null
+++ b/tests/generic/705
@@ -0,0 +1,77 @@
+#! /bin/bash
+# FS QA Test No. 705
+#
+# Test fallocate(ZERO_RANGE) on a block device, which should be able to
+# WRITE SAME (or equivalent) the range.
+#
+#---
+# Copyright (c) 2016 Oracle, Inc.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 7 15
+
+_cleanup()
+{
+cd /
+rm -rf $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/scsi_debug
+
+# real QA test starts here
+_supported_os Linux
+_require_scsi_debug
+_require_xfs_io_command "fzero"
+
+echo "Create and format"
+dev=$(_get_scsi_debug_dev 512 512 0 4 "lbpws=1 lbpws10=1")
+_pwrite_byte 0x62 0 4m $dev >> $seqres.full
+
+echo "Zero range"
+$XFS_IO_PROG -c "fzero -k 512k 1m" $dev
+
+echo "Zero range without keep_size"
+$XFS_IO_PROG -c "fzero 384k 64k" $dev
+
+echo "Zero range past EOD"
+$XFS_IO_PROG -c "fzero -k 3m 4m" $dev
+
+echo "Check contents"
+md5sum $dev | sed -e "s|$dev|SCSI_DEBUG_DEV|g"
+
+echo "Zero range to EOD"
+$XFS_IO_PROG -c "fzero -k 0 9223372036854775807" $dev
+
+echo "Check contents"
+md5sum $dev | sed -e "s|$dev|SCSI_DEBUG_DEV|g"
+
+echo "Destroy device"
+_put_scsi_debug_dev
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/705.out b/tests/generic/705.out
new file mode 100644
index 000..ccbda23
--- /dev/null
+++ b/tests/generic/705.out
@@ -0,0 +1,11 @@
+QA output created by 705
+Create and format
+Zero range
+Zero range without keep_size
+Zero range past EOD
+Check contents
+f0cb9070c098aa347f664bead3a219d9  SCSI_DEBUG_DEV
+Zero range to EOD
+Check contents
+b5cfa9d6c8febd618f91ac2843d50a1c  SCSI_DEBUG_DEV
+Destroy device
diff --git a/tests/generic/706 b/tests/generic/706
new file mode 100755
index 000..184dbc2
--- /dev/null
+++ b/tests/generic/706
@@ -0,0 +1,74 @@
+#! /bin/bash
+# FS QA Test No. 706
+#
+# Test fallocate(PUNCH_HOLE) on a block device, which should be able to
+# zero-TRIM (or equivalent) the range.
+#
+#---
+# Copyright (c) 2016 Oracle, Inc.  All Rights Reserved.
+#
+# This program is free software; 

Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Henk Slager
On Tue, Mar 15, 2016 at 2:42 PM, Marc Haber  wrote:
> On Tue, Mar 15, 2016 at 02:29:32PM +0100, Marc Haber wrote:
>> After umounting and btrfs check the block device, things seem to be
>> fine now
>
> But, umounting the btrfs seemed to trigger the following kernel traces:
>
> Mar 15 14:21:30 fan kernel: [92308.377104] [ cut here 
> ]
> Mar 15 14:21:30 fan kernel: [92308.377135] WARNING: CPU: 5 PID: 28243 at 
> fs/btrfs/extent-tree.c:5380 bt
> rfs_free_block_groups+0x1bc/0x36f [btrfs]()
> Mar 15 14:21:30 fan kernel: [92308.377137] Modules linked in: vhost_net vhost 
> macvtap macvlan tun iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
> nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp dummy ebtable_filter 
> ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables 
> cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative bridge 
> stp llc snd_cmipci snd_hda_codec_realtek snd_hda_codec_generic 
> snd_hda_codec_hdmi kvm_amd snd_mpu401_uart snd_opl3_lib snd_rawmidi kvm 
> snd_hda_intel snd_seq_device snd_hda_codec snd_hda_core snd_hwdep 
> amd64_edac_mod snd_pcm_oss edac_mce_amd irqbypass input_leds snd_mixer_oss 
> pcspkr k10temp edac_core snd_pcm snd_timer snd i2c_piix4 asus_atk0110 
> soundcore acpi_cpufreq tpm_tis tpm sg processor evdev shpchp hwmon_vid 
> autofs4 crc32c_generic btrfs xor raid6_pq ext4 crc16 mbcache jbd2 hmac 
> sha256_ssse3 sha256_generic drbg ansi_cprng xts gf128mul algif_skcipher 
> af_alg dm_crypt dm_mod hid_generic usbhid hid usb_storage sr_mod sd_mod cdrom 
> ohci_pci r8169 mii amdkfd radeon i2c_algo_bit ahci ttm sym53c8xx libahci 
> xhci_pci scsi_transport_spi drm_kms_helper ohci_hcd ehci_pci xhci_hcd libata 
> ehci_hcd drm usbcore scsi_mod usb_common i2c_core button
> Mar 15 14:21:30 fan kernel: [92308.377203] CPU: 5 PID: 28243 Comm: umount Not 
> tainted 4.4.5-zgws1 #2
> Mar 15 14:21:30 fan kernel: [92308.377205] Hardware name: System manufacturer 
> System Product Name/M5A88-V EVO, BIOS 160310/12/2012
> Mar 15 14:21:30 fan kernel: [92308.377207]  005b 811dd418 
>  0009
> Mar 15 14:21:30 fan kernel: [92308.377210]  81051e21 a047a147 
> 880600a28000 
> Mar 15 14:21:30 fan kernel: [92308.377212]  880600a28080 8805af7eea00 
> a047a147 880600a28000
> Mar 15 14:21:30 fan kernel: [92308.377215] Call Trace:
> Mar 15 14:21:30 fan kernel: [92308.377221]  [] ? 
> dump_stack+0x5a/0x6f
> Mar 15 14:21:30 fan kernel: [92308.377224]  [] ? 
> warn_slowpath_common+0x8e/0xa3
> Mar 15 14:21:30 fan kernel: [92308.377239]  [] ? 
> btrfs_free_block_groups+0x1bc/0x36f[btrfs]
> Mar 15 14:21:30 fan kernel: [92308.377252]  [] ? 
> btrfs_free_block_groups+0x1bc/0x36f[btrfs]
> Mar 15 14:21:30 fan kernel: [92308.377267]  [] ? 
> close_ctree+0x1e6/0x2f2 [btrfs]
> Mar 15 14:21:30 fan kernel: [92308.377271]  [] ? 
> generic_shutdown_super+0x64/0xdf
> Mar 15 14:21:30 fan kernel: [92308.377273]  [] ? 
> kill_anon_super+0x9/0xe
> Mar 15 14:21:30 fan kernel: [92308.377285]  [] ? 
> btrfs_kill_super+0xd/0x16 [btrfs]
> Mar 15 14:21:30 fan kernel: [92308.377288]  [] ? 
> deactivate_locked_super+0x2f/0x56
> Mar 15 14:21:30 fan kernel: [92308.377291]  [] ? 
> cleanup_mnt+0x4f/0x6b
> Mar 15 14:21:30 fan kernel: [92308.377293]  [] ? 
> task_work_run+0x5d/0x71
> Mar 15 14:21:30 fan kernel: [92308.377296]  [] ? 
> prepare_exit_to_usermode+0x70/0x99
> Mar 15 14:21:30 fan kernel: [92308.377300]  [] ? 
> int_ret_from_sys_call+0x25/0x8f
> Mar 15 14:21:30 fan kernel: [92308.377302] ---[ end trace 18c6bb90b0c6c689 
> ]---
>
> Mar 15 14:21:30 fan kernel: [92308.377303] [ cut here 
> ]
> Mar 15 14:21:30 fan kernel: [92308.377318] WARNING: CPU: 5 PID: 28243 at 
> fs/btrfs/extent-tree.c:5381 btrfs_free_block_groups+0x1d7/0x36f [btrfs]()
> Mar 15 14:21:30 fan kernel: [92308.377319] Modules linked in: vhost_net vhost 
> macvtap macvlan tun iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
> nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp dummy ebtable_filter 
> ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables 
> cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative bridge 
> stp llc snd_cmipci snd_hda_codec_realtek snd_hda_codec_generic 
> snd_hda_codec_hdmi kvm_amd snd_mpu401_uart snd_opl3_lib snd_rawmidi kvm 
> snd_hda_intel snd_seq_device snd_hda_codec snd_hda_core snd_hwdep 
> amd64_edac_mod snd_pcm_oss edac_mce_amd irqbypass input_leds snd_mixer_oss 
> pcspkr k10temp edac_core snd_pcm snd_timer snd i2c_piix4 asus_atk0110 
> soundcore acpi_cpufreq tpm_tis tpm sg processor evdev shpchp hwmon_vid 
> autofs4 crc32c_generic btrfs xor raid6_pq ext4 crc16 mbcache jbd2 hmac 
> sha256_ssse3 sha256_generic drbg ansi_cprng xts gf128mul algif_skcipher 
> af_alg dm_crypt dm_mod hid_generic usbhid hid usb_storage sr_mod sd_mod cdrom 
> ohci_pci r8169 mii amdkfd 

btrfs error

2016-03-15 Thread Paul Harrison
Hi all,

I'm new to btrfs, and have just taken over management of this system; can
anyone point me in the right direction with regard to the following error:

BTRFS error (device sda3): unable to find ref byte nr 402100224 parent 0 root 256 owner 1 offset
btrfs: Transaction aborted (error -2)
BTRFS error (device sda3) in _btrfs_free_extent:5696: errno=-2 No such entry
BTRFS error (device sda3) in btrfs_run_delayed_refs:2681: errno=-2 No such entry
BTRFS error (device sda3) in cleanup_transaction:1530: errno=-2 No such entry
iTCO_wdt: probe of iTCO_wdt failed with error -16

The filesystem is now read-only, can I recover from this?

I am running SUSE Linux Enterprise 11 SP4 64-bit.

Thanks in advance,

Paul


Re: Snapshots slowing system

2016-03-15 Thread Duncan
pete posted on Mon, 14 Mar 2016 23:03:52 + as excerpted:

> [Duncan wrote...]

>>pete posted on Sat, 12 Mar 2016 13:01:17 + as excerpted:
>>> 
>>> Subvolumes are mounted with the following options:
>>> autodefrag,relatime,compress=lzo,subvol=>
> 
>>That relatime (which is the default), could be an issue.  See below.
> 
> I've now changed that to noatime.  I think I read or missread relatime
> as a good comprimise sometime in the past.

Well, "good" is relative (ha! much like relatime itself! =:^).

Relatime is certainly better than strictatime as it cuts down on atime 
updates quite a bit, and as a default it's a reasonable compromise (at 
least for most filesystems), because it /does/ do a pretty good job of 
eliminating /most/ atime updates while still doing the minimal amount to 
avoid breaking all known apps that still rely on what is mostly a legacy 
POSIX feature that very little modern software actually relies on any 
more.

For normal filesystems and normal use-cases, relatime really is a 
reasonably "good" compromise.  But btrfs is definitely not a traditional 
filesystem, relying as it does on COW, and snapshotting is even more 
definitely not a traditional filesystem feature.  Relatime does still 
work, but it's just not particularly suitable to frequent snapshotting.

Meanwhile, so little actually depends on atime these days, that unless 
you're trying to work out a compromise solution for a kernel with a 
standing rule that breaking working userspace is simply not acceptable, 
the context in which relatime was developed and for which it really is a 
good compromise, chances are pretty high that unless you are running 
something like mutt that is /known/ to need atime, you can simply set 
noatime and forget about it.

And I'm sure, were the kernel rules on avoiding breaking old but 
otherwise still working userspace somewhat less strict, noatime would be 
the kernel default now, as well.


Meanwhile, FWIW, some months ago I finally got tired of having to specify 
noatime on all my mounts, expanding my fstab width by 8 chars (including 
the ,) and the total fstab character count by several multiples of that 
as I added it to all entries, and decided to see if I might per chance, 
even as a sysadmin not a dev, be able to come up with a patch that 
changed the kernel default to noatime.  It wasn't actually hard, tho were 
I a coder and actually knew what I was doing, I imagine I could create a 
much better patch.  So now all my filesystems (barring a few of the 
memory-only virtual-filesystem mounts) are mounted noatime by default, as 
opposed to the unpatched relatime, and I was able to take all the noatimes 
out of my fstab. =:^)

>>Normally when posting, either btrfs fi df *and* btrfs fi show are
>>needed, /or/ (with a new enough btrfs-progs) btrfs fi usage.  And of
>>course the kernel (4.0.4 in your case) and btrfs-progs (not posted, that
>>I saw) versions.
> 
> OK, I have usage.  For the SSD with the system:
> 
> root@phoenix:~# btrfs fi usage /
> Overall:
> Device size:   118.05GiB
> Device allocated:  110.06GiB
> Device unallocated:  7.99GiB
> Used:  103.46GiB
> Free (estimated):   11.85GiB  (min: 11.85GiB)
> Data ratio: 1.00
> Metadata ratio: 1.00
> Global reserve:512.00MiB  (used: 0.00B)
> 
> Data,single: Size:102.03GiB, Used:98.16GiB
>/dev/sda3   102.03GiB
> 
> Metadata,single: Size:8.00GiB, Used:5.30GiB
>/dev/sda3 8.00GiB
> 
> System,single: Size:32.00MiB, Used:16.00KiB
>/dev/sda332.00MiB
> 
> Unallocated:
>/dev/sda3 7.99GiB
> 
> 
> Hmm.  A bit tight.  I've just ordered a replacement SSD.

While ~8 GiB unallocated on a ~118 GiB filesystem is indeed a bit tight, 
it's nothing that should be giving btrfs fits yet.
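
To put a number on "a bit tight", the unallocated share can be worked
out from the two figures in the usage output above. A throwaway sketch
(the function name is made up; awk does the floating-point division):

```shell
#!/bin/sh
# Sketch: express unallocated space as a percentage of device size,
# using the GiB figures from the `btrfs fi usage` output quoted above.

# pct_unallocated UNALLOCATED_GIB DEVICE_GIB
pct_unallocated() {
    awk -v u="$1" -v d="$2" 'BEGIN { printf "%.1f\n", 100 * u / d }'
}

pct_unallocated 7.99 118.05    # prints 6.8
```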

Tho even with autodefrag, given the previous relatime and snapshotting, 
it could be that the free-space in existing chunks is fragmented, which 
over time and continued usage would force higher file fragmentation 
despite the autodefrag, since there simply aren't any large contiguous 
free-space areas left in which to write files.

> Slackware
> should it in about 5GB+ of disk space I've seen on a website?  Hmm. 
> Don't beleive that.  I'd allow at least 10GB and more if I want to add
> extra packages such as libreoffice.  If I have no snapshots it seems to
> get to 45GB with various extra packages installed and grows to 100ish
> with snapshotting probally owing to updates.

FWIW, here on gentoo and actually using separate partitions and btrfs,
/not/ btrfs subvolumes (because I don't want all my data eggs in the same 
filesystem basket, should that filesystem go bad)...

My / is 8 GiB (per device, btrfs raid1 both data and metadata on 
partitions from two ssds, so same stuff on each device) including all 
files installed by packages except some individual subdirs in /var/ which 
are 

Re: raid 10 recovery

2016-03-15 Thread Henk Slager
On Tue, Mar 15, 2016 at 3:40 PM, Marius Räsener  wrote:
> Hey List,
>
> I have a raid10 on a rockstor machine (centOS based 
> "4.3.3-1.el7.elrepo.x86_64“) where one drive threw SMART errors.
>
> So I tried to replace them.
>
> 1. mount in degraded worked
> 2. start replace with the new drive (btrfs replace start 4 /dev/sdf 
> /mnt2/pool_name) worked fine first too
> 3. shutdown (from shutdown command and then after some minutes power off 
> since it doesn’t happen anything anymore)
>
> 4. I can’t mount anymore and restart the procedure
>
> here some logs:
>
> [root@rockstor ~]# dmesg | tail
> [ 2039.773261] BTRFS info (device sde): btrfs: use no compression
> [ 2039.773265] BTRFS info (device sde): disk space caching is enabled
> [ 2039.773266] BTRFS: has skinny extents
> [ 2039.775221] BTRFS: failed to read chunk tree on sde
> [ 2039.785544] BTRFS: open_ctree failed
> [ 2039.892750] BTRFS info (device sde): btrfs: use no compression
> [ 2039.892758] BTRFS info (device sde): disk space caching is enabled
> [ 2039.892761] BTRFS: has skinny extents
> [ 2039.894223] BTRFS: failed to read chunk tree on sde
> [ 2039.909454] BTRFS: open_ctree failed
>
> ###
> ###
> ###
>
> [root@rockstor ~]# btrfs fi sh
> Label: 'rockstor_rockstor'  uuid: 43308bee-a17f-4b11-9bbf-e9bd3fcc7b6f
>         Total devices 2 FS bytes used 1.44GiB
>         devid    1 size 110.88GiB used 3.02GiB path /dev/sda3
>         devid    2 size 1.82TiB used 0.00B path /dev/sdf
>
> warning, device 4 is missing
> warning devid 4 not found already
> Label: 'waldpark'  uuid: bc36f340-8b89-421e-9288-b573457ed7de
>         Total devices 4 FS bytes used 1.38TiB
>         devid    1 size 1.82TiB used 708.01GiB path /dev/sdc
>         devid    2 size 1.82TiB used 708.01GiB path /dev/sdd
>         devid    3 size 1.82TiB used 707.54GiB path /dev/sde
>         *** Some devices missing
>
> ###
> ###
> ###
>
> [root@rockstor ~]# btrfs-show-super /dev/sdc
> superblock: bytenr=65536, device=/dev/sdc
> -
> csum                    0x81ca0c6d [match]
> bytenr                  65536
> flags                   0x1
>                         ( WRITTEN )
> magic                   _BHRfS_M [match]
> fsid                    bc36f340-8b89-421e-9288-b573457ed7de
> label                   waldpark
> generation              949
> root                    1288110080
> sys_array_size          290
> chunk_root_generation   902
> root_level              1
> chunk_root              1519382429696
> chunk_root_level        1
> log_root                0
> log_root_transid        0
> log_root_level          0
> total_bytes             8001595736064
> bytes_used              1517774807040
> sectorsize              4096
> nodesize                16384
> leafsize                16384
> stripesize              4096
> root_dir                6
> num_devices             4
> compat_flags            0x0
> compat_ro_flags         0x0
> incompat_flags          0x161
>                         ( MIXED_BACKREF |
>                           BIG_METADATA |
>                           EXTENDED_IREF |
>                           SKINNY_METADATA )
> csum_type               0
> csum_size               4
> cache_generation        949
> uuid_tree_generation    949
> dev_item.uuid           e6ede03b-2432-4d75-ad85-399730ce0bc9
> dev_item.fsid           bc36f340-8b89-421e-9288-b573457ed7de [match]
> dev_item.type           0
> dev_item.total_bytes    2000398934016
> dev_item.bytes_used     76021760
> dev_item.io_align       4096
> dev_item.io_width       4096
> dev_item.sector_size    4096
> dev_item.devid          1
> dev_item.dev_group      0
> dev_item.seek_speed     0
> dev_item.bandwidth      0
> dev_item.generation     0
>
> ###
> ###
> ###
>
> [root@rockstor ~]# mount -o degraded /dev/sdc /mnt2/waldpark
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
>missing codepage or helper program, or other error
>
>In some cases useful info is found in syslog - try
>dmesg | tail or so.

Do the following mounts still work?
# mount -o degraded,recovery,ro /dev/sdc /mnt2/waldpark
# mount -o degraded,recovery /dev/sdc /mnt2/waldpark


> ###
> ###
> ###
>
> [root@rockstor ~]# btrfs check /dev/sdc
> warning, device 4 is missing
> warning devid 4 not found already
> Checking filesystem on /dev/sdc
> UUID: bc36f340-8b89-421e-9288-b573457ed7de
> checking extents
> checking free space cache
> checking fs roots
> checking csums
> checking root refs
> checking quota groups
> Ignoring qgroup relation key 258
> Ignoring qgroup relation key 965
> Ignoring qgroup relation key 966
> Ignoring qgroup relation key 967
> 

Re: [PULL] Btrfs cleanups for 4.6, part 2

2016-03-15 Thread David Sterba
On Tue, Mar 15, 2016 at 07:24:48AM -0700, Chris Mason wrote:
> On Tue, Mar 15, 2016 at 02:50:14PM +0100, David Sterba wrote:
> > Hi,
> > 
> > a few more cleanups sent recently and some that I found in my inbox marked but
> > not processed. Based on top of current integration. Please pull, thanks.
> 
> Thanks Dave, I'll get these pulled in and restart my long stress run.

I've picked a few more patches into my for-next, but they're neither reviewed
nor tested. You may want to add them to the long test. They are in my k.org
git, in branch misc-4.6.

Alex Lyakas (2):
  btrfs: csum_tree_block: return proper errno value
  btrfs: do not write corrupted metadata blocks to disk

Jiri Kosina (2):
  btrfs: cleaner_kthread() doesn't need explicit freeze
  btrfs: transaction_kthread() is not freezable

Liu Bo (2):
  Btrfs: make mapping->writeback_index point to the last written page
  Btrfs: cleanup error handling in extent_write_cached_pages
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


raid 10 recovery

2016-03-15 Thread Marius Räsener
Hey List,

I have a raid10 on a Rockstor machine (CentOS based,
"4.3.3-1.el7.elrepo.x86_64") where one drive threw SMART errors.

So I tried to replace it.

1. Mounting in degraded mode worked.
2. Starting the replace with the new drive (btrfs replace start 4 /dev/sdf
/mnt2/pool_name) worked fine at first, too.
3. I shut down (with the shutdown command, then powered off after some
minutes because nothing was happening anymore).

4. Now I can't mount it anymore or restart the procedure.

Here are some logs:

[root@rockstor ~]# dmesg | tail
[ 2039.773261] BTRFS info (device sde): btrfs: use no compression
[ 2039.773265] BTRFS info (device sde): disk space caching is enabled
[ 2039.773266] BTRFS: has skinny extents
[ 2039.775221] BTRFS: failed to read chunk tree on sde
[ 2039.785544] BTRFS: open_ctree failed
[ 2039.892750] BTRFS info (device sde): btrfs: use no compression
[ 2039.892758] BTRFS info (device sde): disk space caching is enabled
[ 2039.892761] BTRFS: has skinny extents
[ 2039.894223] BTRFS: failed to read chunk tree on sde
[ 2039.909454] BTRFS: open_ctree failed

###
###
###

[root@rockstor ~]# btrfs fi sh
Label: 'rockstor_rockstor'  uuid: 43308bee-a17f-4b11-9bbf-e9bd3fcc7b6f
Total devices 2 FS bytes used 1.44GiB
devid 1 size 110.88GiB used 3.02GiB path /dev/sda3
devid 2 size 1.82TiB used 0.00B path /dev/sdf

warning, device 4 is missing
warning devid 4 not found already
Label: 'waldpark'  uuid: bc36f340-8b89-421e-9288-b573457ed7de
Total devices 4 FS bytes used 1.38TiB
devid 1 size 1.82TiB used 708.01GiB path /dev/sdc
devid 2 size 1.82TiB used 708.01GiB path /dev/sdd
devid 3 size 1.82TiB used 707.54GiB path /dev/sde
*** Some devices missing

###
###
###

[root@rockstor ~]# btrfs-show-super /dev/sdc
superblock: bytenr=65536, device=/dev/sdc
-
csum                    0x81ca0c6d [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    bc36f340-8b89-421e-9288-b573457ed7de
label                   waldpark
generation              949
root                    1288110080
sys_array_size          290
chunk_root_generation   902
root_level              1
chunk_root              1519382429696
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             8001595736064
bytes_used              1517774807040
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             4
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x161
                        ( MIXED_BACKREF |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
csum_type               0
csum_size               4
cache_generation        949
uuid_tree_generation    949
dev_item.uuid           e6ede03b-2432-4d75-ad85-399730ce0bc9
dev_item.fsid           bc36f340-8b89-421e-9288-b573457ed7de [match]
dev_item.type           0
dev_item.total_bytes    2000398934016
dev_item.bytes_used     76021760
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0
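
As a cross-check of the dump above, the raw byte counts can be converted to
the binary units that btrfs fi show prints. This is only a sketch in Python;
the helper name human() is made up for illustration and is not part of
btrfs-progs.

```python
# Sketch: convert raw byte counts from the superblock dump into binary units.
# human() is a hypothetical helper, not a btrfs-progs function.
def human(n: int) -> str:
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n)
    for unit in units:
        # Stop at the first unit where the value drops below 1024.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f}{unit}"
        value /= 1024


total_bytes = 8001595736064  # total_bytes from the dump
bytes_used = 1517774807040   # bytes_used from the dump
dev_total = 2000398934016    # dev_item.total_bytes for devid 1

print(human(total_bytes))  # 7.28TiB
print(human(bytes_used))   # 1.38TiB
print(human(dev_total))    # 1.82TiB
```

The results agree with the "FS bytes used 1.38TiB" and "size 1.82TiB" figures
that btrfs fi sh reports for 'waldpark' earlier in this message.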

###
###
###

[root@rockstor ~]# mount -o degraded /dev/sdc /mnt2/waldpark
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

###
###
###

[root@rockstor ~]# btrfs check /dev/sdc
warning, device 4 is missing
warning devid 4 not found already
Checking filesystem on /dev/sdc
UUID: bc36f340-8b89-421e-9288-b573457ed7de
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
checking quota groups
Ignoring qgroup relation key 258
Ignoring qgroup relation key 965
Ignoring qgroup relation key 966
Ignoring qgroup relation key 967
Ignoring qgroup relation key 567172078071971841
Ignoring qgroup relation key 567172078071971841
Ignoring qgroup relation key 567172078071971841
Ignoring qgroup relation key 567172078071971841
found 1517403873988 bytes used err is 0
total csum bytes: 1479836856
total tree bytes: 2050932736
total fs tree bytes: 451002368
total extent tree bytes: 51707904
btree space waste bytes: 112210244
file data blocks allocated: 1680396222464
 referenced 1666725031936
extent buffer leak: start 

Re: [PULL] Btrfs cleanups for 4.6, part 2

2016-03-15 Thread Chris Mason
On Tue, Mar 15, 2016 at 02:50:14PM +0100, David Sterba wrote:
> Hi,
> 
> a few more cleanups sent recently and some that I found in my inbox marked but
> not processed. Based on top of current integration. Please pull, thanks.

Thanks Dave, I'll get these pulled in and restart my long stress run.

-chris


Re: New file system with same issue

2016-03-15 Thread Marc Haber
On Tue, Mar 15, 2016 at 09:54:06AM -0400, Austin S. Hemmelgarn wrote:
> On 2016-03-15 09:46, Marc Haber wrote:
> >On Tue, Mar 15, 2016 at 11:52:30AM +0100, Holger Hoffstätte wrote:
> >>On 03/14/16 21:13, Marc Haber wrote:
> >>>Do I need to wait for clear_cache to finish, like until I see disk
> >>>usage dropping?
> >>
> >>The cache isn't that big, so you won't see a huge drop. Just use the
> >>disk normally for a few minutes, after some time the cache will be
> >>written out again.
> >
> >Is it necessary to actually cause activity on the file system or is it
> >ok to just let it sit there for an hour or so?
> It should be OK to just let it sit there for ten or fifteen minutes. I'm
> pretty certain that the free space cache gets rebuilt relatively quickly,
> and I'm almost 100% certain that the old one gets dropped within seconds of
> the FS being mounted with -o clear_cache.  I've rebuilt the cache on the 64G
> root filesystem on my laptop a couple of times before, and it consistently
> appears to take about 2-3 minutes to do so at most (based on disk usage from
> the kernel itself).

In my case, atop has not seen any notable disk activity after mounting
with -o clear_cache.

Greetings
Marc

-- 
-
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: New file system with same issue

2016-03-15 Thread Austin S. Hemmelgarn

On 2016-03-15 09:46, Marc Haber wrote:
> On Tue, Mar 15, 2016 at 11:52:30AM +0100, Holger Hoffstätte wrote:
>> On 03/14/16 21:13, Marc Haber wrote:
>>> Do I need to wait for clear_cache to finish, like until I see disk
>>> usage dropping?
>>
>> The cache isn't that big, so you won't see a huge drop. Just use the
>> disk normally for a few minutes, after some time the cache will be
>> written out again.
>
> Is it necessary to actually cause activity on the file system or is it
> ok to just let it sit there for an hour or so?
It should be OK to just let it sit there for ten or fifteen minutes. I'm 
pretty certain that the free space cache gets rebuilt relatively 
quickly, and I'm almost 100% certain that the old one gets dropped 
within seconds of the FS being mounted with -o clear_cache.  I've 
rebuilt the cache on the 64G root filesystem on my laptop a couple of 
times before, and it consistently appears to take about 2-3 minutes to 
do so at most (based on disk usage from the kernel itself).




[PULL] Btrfs cleanups for 4.6, part 2

2016-03-15 Thread David Sterba
Hi,

a few more cleanups sent recently and some that I found in my inbox marked but
not processed. Based on top of current integration. Please pull, thanks.


The following changes since commit 5e33a2bd7ca7fa687fb0965869196eea6815d1f3:

  Btrfs: do not collect ordered extents when logging that inode exists (2016-03-01 08:23:47 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git cleanups-4.6

for you to fetch changes up to bb7ab3b92e46da06b580c6f83abe7894dc449cca:

  btrfs: Fix misspellings in comments. (2016-03-14 15:05:02 +0100)


Adam Buchbinder (1):
  btrfs: Fix misspellings in comments.

Anand Jain (2):
  btrfs: rename btrfs_print_info to btrfs_print_mod_info
  btrfs: move btrfs_compression_type to compression.h

Ashish Samant (1):
  btrfs: Print Warning only if ENOSPC_DEBUG is enabled

Dan Carpenter (1):
  btrfs: scrub: silence an uninitialized variable warning

David Sterba (1):
  Documentation: btrfs: remove usage specific information

Rasmus Villemoes (1):
  btrfs: use kbasename in btrfsic_mount

Satoru Takeuchi (1):
  Btrfs: Show a warning message if one of objectid reaches its highest value

 Documentation/filesystems/btrfs.txt | 274 ++--
 fs/btrfs/check-integrity.c          |  12 +-
 fs/btrfs/compression.h              |   9 ++
 fs/btrfs/ctree.h                    |  18 +--
 fs/btrfs/delayed-inode.c            |   7 +-
 fs/btrfs/dev-replace.c              |   2 +-
 fs/btrfs/disk-io.c                  |   3 +-
 fs/btrfs/extent-tree.c              |   2 +-
 fs/btrfs/extent_map.c               |   5 +-
 fs/btrfs/file-item.c                |   1 +
 fs/btrfs/file.c                     |   3 +-
 fs/btrfs/inode-map.c                |   3 +
 fs/btrfs/ioctl.c                    |   1 +
 fs/btrfs/ordered-data.c             |   3 +-
 fs/btrfs/props.c                    |   1 +
 fs/btrfs/scrub.c                    |   2 +-
 fs/btrfs/send.c                     |   1 +
 fs/btrfs/super.c                    |   4 +-
 fs/btrfs/tests/inode-tests.c        |   1 +
 fs/btrfs/tree-log.c                 |   3 +-
 fs/btrfs/volumes.c                  |   4 +-
 21 files changed, 62 insertions(+), 297 deletions(-)


Re: New file system with same issue

2016-03-15 Thread Marc Haber
On Tue, Mar 15, 2016 at 11:52:30AM +0100, Holger Hoffstätte wrote:
> On 03/14/16 21:13, Marc Haber wrote:
> > Do I need to wait for clear_cache to finish, like until I see disk
> > usage dropping?
> 
> The cache isn't that big, so you won't see a huge drop. Just use the
> disk normally for a few minutes, after some time the cache will be
> written out again.

Is it necessary to actually cause activity on the file system or is it
ok to just let it sit there for an hour or so?

Greetings
Marc

-- 
-
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber
On Tue, Mar 15, 2016 at 02:29:32PM +0100, Marc Haber wrote:
> After umounting and running btrfs check on the block device, things
> seem to be fine now

But, umounting the btrfs seemed to trigger the following kernel traces:

Mar 15 14:21:30 fan kernel: [92308.377104] ------------[ cut here ]------------
Mar 15 14:21:30 fan kernel: [92308.377135] WARNING: CPU: 5 PID: 28243 at 
fs/btrfs/extent-tree.c:5380 btrfs_free_block_groups+0x1bc/0x36f [btrfs]()
Mar 15 14:21:30 fan kernel: [92308.377137] Modules linked in: vhost_net vhost 
macvtap macvlan tun iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp dummy ebtable_filter 
ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables 
cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative bridge 
stp llc snd_cmipci snd_hda_codec_realtek snd_hda_codec_generic 
snd_hda_codec_hdmi kvm_amd snd_mpu401_uart snd_opl3_lib snd_rawmidi kvm 
snd_hda_intel snd_seq_device snd_hda_codec snd_hda_core snd_hwdep 
amd64_edac_mod snd_pcm_oss edac_mce_amd irqbypass input_leds snd_mixer_oss 
pcspkr k10temp edac_core snd_pcm snd_timer snd i2c_piix4 asus_atk0110 soundcore 
acpi_cpufreq tpm_tis tpm sg processor evdev shpchp hwmon_vid autofs4 
crc32c_generic btrfs xor raid6_pq ext4 crc16 mbcache jbd2 hmac sha256_ssse3 
sha256_generic drbg ansi_cprng xts gf128mul algif_skcipher af_alg dm_crypt 
dm_mod hid_generic usbhid hid usb_storage sr_mod sd_mod cdrom ohci_pci r8169 
mii amdkfd radeon i2c_algo_bit ahci ttm sym53c8xx libahci xhci_pci 
scsi_transport_spi drm_kms_helper ohci_hcd ehci_pci xhci_hcd libata ehci_hcd 
drm usbcore scsi_mod usb_common i2c_core button
Mar 15 14:21:30 fan kernel: [92308.377203] CPU: 5 PID: 28243 Comm: umount Not 
tainted 4.4.5-zgws1 #2
Mar 15 14:21:30 fan kernel: [92308.377205] Hardware name: System manufacturer 
System Product Name/M5A88-V EVO, BIOS 1603 10/12/2012
Mar 15 14:21:30 fan kernel: [92308.377207]  005b 811dd418 
 0009
Mar 15 14:21:30 fan kernel: [92308.377210]  81051e21 a047a147 
880600a28000 
Mar 15 14:21:30 fan kernel: [92308.377212]  880600a28080 8805af7eea00 
a047a147 880600a28000
Mar 15 14:21:30 fan kernel: [92308.377215] Call Trace:
Mar 15 14:21:30 fan kernel: [92308.377221]  [] ? 
dump_stack+0x5a/0x6f
Mar 15 14:21:30 fan kernel: [92308.377224]  [] ? 
warn_slowpath_common+0x8e/0xa3
Mar 15 14:21:30 fan kernel: [92308.377239]  [] ? 
btrfs_free_block_groups+0x1bc/0x36f[btrfs]
Mar 15 14:21:30 fan kernel: [92308.377252]  [] ? 
btrfs_free_block_groups+0x1bc/0x36f[btrfs]
Mar 15 14:21:30 fan kernel: [92308.377267]  [] ? 
close_ctree+0x1e6/0x2f2 [btrfs]
Mar 15 14:21:30 fan kernel: [92308.377271]  [] ? 
generic_shutdown_super+0x64/0xdf
Mar 15 14:21:30 fan kernel: [92308.377273]  [] ? 
kill_anon_super+0x9/0xe
Mar 15 14:21:30 fan kernel: [92308.377285]  [] ? 
btrfs_kill_super+0xd/0x16 [btrfs]
Mar 15 14:21:30 fan kernel: [92308.377288]  [] ? 
deactivate_locked_super+0x2f/0x56
Mar 15 14:21:30 fan kernel: [92308.377291]  [] ? 
cleanup_mnt+0x4f/0x6b
Mar 15 14:21:30 fan kernel: [92308.377293]  [] ? 
task_work_run+0x5d/0x71
Mar 15 14:21:30 fan kernel: [92308.377296]  [] ? 
prepare_exit_to_usermode+0x70/0x99
Mar 15 14:21:30 fan kernel: [92308.377300]  [] ? 
int_ret_from_sys_call+0x25/0x8f
Mar 15 14:21:30 fan kernel: [92308.377302] ---[ end trace 18c6bb90b0c6c689 ]---

Mar 15 14:21:30 fan kernel: [92308.377303] ------------[ cut here ]------------
Mar 15 14:21:30 fan kernel: [92308.377318] WARNING: CPU: 5 PID: 28243 at 
fs/btrfs/extent-tree.c:5381 btrfs_free_block_groups+0x1d7/0x36f [btrfs]()
Mar 15 14:21:30 fan kernel: [92308.377319] Modules linked in: vhost_net vhost 
macvtap macvlan tun iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp dummy ebtable_filter 
ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables 
cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative bridge 
stp llc snd_cmipci snd_hda_codec_realtek snd_hda_codec_generic 
snd_hda_codec_hdmi kvm_amd snd_mpu401_uart snd_opl3_lib snd_rawmidi kvm 
snd_hda_intel snd_seq_device snd_hda_codec snd_hda_core snd_hwdep 
amd64_edac_mod snd_pcm_oss edac_mce_amd irqbypass input_leds snd_mixer_oss 
pcspkr k10temp edac_core snd_pcm snd_timer snd i2c_piix4 asus_atk0110 soundcore 
acpi_cpufreq tpm_tis tpm sg processor evdev shpchp hwmon_vid autofs4 
crc32c_generic btrfs xor raid6_pq ext4 crc16 mbcache jbd2 hmac sha256_ssse3 
sha256_generic drbg ansi_cprng xts gf128mul algif_skcipher af_alg dm_crypt 
dm_mod hid_generic usbhid hid usb_storage sr_mod sd_mod cdrom ohci_pci r8169 
mii amdkfd radeon i2c_algo_bit ahci ttm sym53c8xx libahci xhci_pci 
scsi_transport_spi drm_kms_helper ohci_hcd ehci_pci xhci_hcd libata ehci_hcd 
drm usbcore scsi_mod usb_common i2c_core button
Mar 15 14:21:30 fan kernel: [92308.377362] CPU: 5 PID: 28243 Comm: umount 

Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber
On Mon, Mar 14, 2016 at 09:05:46PM +0100, Marc Haber wrote:
> [10/509]mh@fan:~$ sudo btrfs check /media/tempdisk/
> Superblock bytenr is larger than device size
> Couldn't open file system
> [11/509]mh@fan:~$

After umounting and running btrfs check on the block device, things
seem to be fine now:

[34/532]mh@fan:~$ sudo btrfs check /dev/mapper/ofanbtr
Checking filesystem on /dev/mapper/ofanbtr
UUID: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 86554574954 bytes used err is 0
total csum bytes: 81815012
total tree bytes: 2476670976
total fs tree bytes: 2246311936
total extent tree bytes: 133201920
btree space waste bytes: 452859567
file data blocks allocated: 292994375680
 referenced 132664688640
[35/533]mh@fan:~$ sudo btrfs check /dev/mapper/ofanbtr
Checking filesystem on /dev/mapper/ofanbtr
UUID: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 86554574954 bytes used err is 0
total csum bytes: 81815012
total tree bytes: 2476670976
total fs tree bytes: 2246311936
total extent tree bytes: 133201920
btree space waste bytes: 452859567
file data blocks allocated: 292994375680
 referenced 132664688640
[36/533]mh@fan:~$

This does not indicate an error, does it?

Greetings
Marc, who would like the tools to be a bit more explicit and consistent
in whether they want the fs mounted, umounted, the mountpoint or the
device on their command line

-- 
-
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber
On Tue, Mar 15, 2016 at 01:15:33PM +0100, Henk Slager wrote:
> On Tue, Mar 15, 2016 at 8:16 AM, Marc Haber wrote:
> > On Tue, Mar 15, 2016 at 12:22:00AM +0100, Henk Slager wrote:
> >> The other question is: What is mounted on /media/tempdisk/  ?
> >
> > The "old" btrfs filesystem "ofanbtr", formerly 417 GB in size, now
> > resized to 300 GB. Does it need to be umounted to be checked?
> 
> Yes, that's the whole point
> 
> >> At least I think a check of the current 200GiB fs is needed. As it is
> >> a rootfs and encrypted, some work is needed to make that happen.
> >
> > You suggested a btrfs check after looking at the image of "ofanbtr".
> > Do you want me to check the new "fanbtr" also?
> 
> I was not sure if 'ofanbtr' is an image created by btrfs-image or a
> extra dd created image you might have locally. Both 'ofanbtr' and
> 'fanbtr' have the same balance issue, but 'fanbtr' is created with
> newer and known kernel+tools version I assume, so that's why the
> suggestion.

ofanbtr is the old btrfs, on /dev/mapper/ofanbtr:
Label: 'ofanbtr'  uuid: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
Total devices 1 FS bytes used 80.63GiB
devid1 size 300.00GiB used 122.06GiB path /dev/mapper/ofanbtr
it was created as 'fanbtr' in September, 300 GiB in size, then (in
February, I think) resized to 417 GiB to make room for more data and
for balancing, used until March 7, and then renamed to ofanbtr with
lvrename and btrfs fi label. It was then imaged, and then resized back
to 300 GiB in the hope that this will fix the size issue.

fanbtr is the new btrfs, on /dev/mapper/fanbtr:
Label: 'fanbtr'  uuid: 90f8d728-6bae-4fca-8cda-b368ba2c008e
Total devices 1 FS bytes used 82.45GiB
devid1 size 200.00GiB used 113.03GiB path /dev/mapper/fanbtr
it was created on March 7, had the data from ofanbtr cp'ed over, and
has been used as the active filesystem since then. It is smaller
because I don't have much more room on the SSD.

Both do have the same balance issue, yes.

Greetings
Marc

-- 
-
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: [PATCH 1/2] btrfs: cleaner_kthread() doesn't need explicit freeze

2016-03-15 Thread Jiri Kosina
On Tue, 15 Mar 2016, Jiri Kosina wrote:

> cleaner_kthread() is not marked freezable, and therefore calling 
> try_to_freeze() in its context is a pointless no-op.
> 
> In addition to that, as has been clearly demonstrated by 80ad623edd2d 
> ("Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"), it's perfectly 
> valid / legal for cleaner_kthread() to stay scheduled out in an arbitrary 
> place during suspend (in that particular example that was waiting for 
> reading of extent pages), so there is no need to leave any traces of 
> freezer in this kthread.

Given some questions I've received offline, let me clarify a little bit 
more here.

Currently, the try_to_freeze() call is completely useless here, because it 
will never actually try to freeze the kthread (as it's PF_NOFREEZE).

When an attempt was made to make the kthread properly freezable, it turned out (see 
e.g. 80ad623edd2d) that it's actually sleeping in various places during 
suspend for long periods of time (my guess would be that it doesn't really 
matter whether the cleaning happens before or after suspend, but this'd be 
something I'd like to have clarified from btrfs folks).

So in a nutshell, this patch (a) doesn't make things worse, as it's an 
equivalent code transformation (b) brings more sanity to how the kthread 
freezing API is used throughout the kernel.
It might very well be that the code was broken before; but it's not more 
broken after this patch, and the API usage is sane.
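
The "equivalent code transformation" claim can be checked with a small truth
table. The following Python sketch is a toy model, not kernel code: it
assumes, as described above, that try_to_freeze() never freezes a
PF_NOFREEZE thread and therefore always returns false for it. The function
names old_condition() and new_condition() are made up for illustration.

```python
# Toy model (not kernel code): a PF_NOFREEZE kthread is never frozen,
# so try_to_freeze() always reports False for it.
def try_to_freeze(pf_nofreeze: bool, freezing: bool) -> bool:
    """Return True only if the thread actually entered the freezer."""
    if pf_nofreeze:
        return False
    return freezing


def old_condition(again: bool, freezing: bool) -> bool:
    # Models: if (!try_to_freeze() && !again)
    return (not try_to_freeze(True, freezing)) and (not again)


def new_condition(again: bool) -> bool:
    # Models: if (!again)
    return not again


# Compare both conditions for every combination of inputs.
equivalent = all(
    old_condition(again, freezing) == new_condition(again)
    for again in (False, True)
    for freezing in (False, True)
)
print(equivalent)  # True
```

Under that assumption the two conditions agree for every input, whether or
not a system-wide freeze is in progress.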

The ultimate goal is first to bring some sanity into how the freezer API 
is used throughout the kernel, and then eventually get rid of it 
completely in favor of fs freezing (currently it's not even possible to 
analyze all the uses in the kernel, as there are way too many and most of 
them are totally broken).

-- 
Jiri Kosina
SUSE Labs


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Henk Slager
On Tue, Mar 15, 2016 at 8:16 AM, Marc Haber  wrote:
> On Tue, Mar 15, 2016 at 12:22:00AM +0100, Henk Slager wrote:
>> The other question is: What is mounted on /media/tempdisk/  ?
>
> The "old" btrfs filesystem "ofanbtr", formerly 417 GB in size, now
> resized to 300 GB. Does it need to be umounted to be checked?

Yes, that's the whole point

>> At least I think a check of the current 200GiB fs is needed. As it is
>> a rootfs and encrypted, some work is needed to make that happen.
>
> You suggested a btrfs check after looking at the image of "ofanbtr".
> Do you want me to check the new "fanbtr" also?

I was not sure if 'ofanbtr' is an image created by btrfs-image or a
extra dd created image you might have locally. Both 'ofanbtr' and
'fanbtr' have the same balance issue, but 'fanbtr' is created with
newer and known kernel+tools version I assume, so that's why the
suggestion.


Re: New file system with same issue

2016-03-15 Thread Holger Hoffstätte
On 03/14/16 21:13, Marc Haber wrote:
> On Mon, Mar 14, 2016 at 01:48:18PM +0100, Holger Hoffstätte wrote:
>> did you ever try to clear the free-space cache via -o clear_cache
>> on mount?
> 
> This was not asked, and I didn't try. Since this is an encrypted root
> filesystem, is it a workable way to add clear_cache to /etc/fstab,
> rebuild initramfs and reboot? Or do you recommend using a rescue system?

If you can do it via a rescue system that might be easiest, but adding
it to fstab and rebooting once has the same effect. Whatever you know
how to do safely.

>> Give it a try, let it run for a while and then try balancing
>> again.
> 
> Do I need to wait for clear_cache to finish, like until I see disk
> usage dropping?

The cache isn't that big, so you won't see a huge drop. Just use the
disk normally for a few minutes, after some time the cache will be
written out again.

>> _Someone_ is lying to btrfs in terms of device size and/or allocated
>> chunks, otherwise you wouldn't get the ENOSPC.
> 
> Which properties does a block device report other than size?

Well... at least everything you can find in /sys/block/sdX/*. However, having
read the other subthread about the mismatching image size, I'm now none the
wiser as to what else to suggest. :/
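
As a rough illustration of reading one of those properties, the sketch below
turns the contents of /sys/block/<dev>/size into bytes. The kernel reports
that value in 512-byte units regardless of the device's logical block size.
The function names and the "sda" default are assumptions for the example;
a device-mapper volume such as /dev/mapper/ofanbtr would appear under a
dm-N name instead.

```python
from pathlib import Path

SECTOR_BYTES = 512  # /sys/block/<dev>/size counts 512-byte units


def sectors_to_bytes(size_text: str) -> int:
    """Convert the text content of a sysfs 'size' file to bytes."""
    return int(size_text.strip()) * SECTOR_BYTES


def block_device_bytes(name: str, sysfs_root: str = "/sys/block") -> int:
    """Read <sysfs_root>/<name>/size and return the device size in bytes."""
    return sectors_to_bytes((Path(sysfs_root) / name / "size").read_text())


if __name__ == "__main__":
    # "sda" is only a placeholder; substitute the device being examined.
    dev = Path("/sys/block/sda/size")
    if dev.exists():
        print(block_device_bytes("sda"))
```

The result can then be compared against the "size" column of btrfs fi show
for the same device.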

-h


[PATCH 2/2] btrfs: transaction_kthread() is not freezable

2016-03-15 Thread Jiri Kosina
transaction_kthread() is calling try_to_freeze(), but that's just an 
expensive no-op given the fact that the thread is not marked freezable.

After removing this, disk-io.c is now independent of the freezer API.

Signed-off-by: Jiri Kosina 
---
 fs/btrfs/disk-io.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d8d68af..4c7361a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -25,7 +25,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -1920,14 +1919,12 @@ sleep:
if (unlikely(test_bit(BTRFS_FS_STATE_ERROR,
  &root->fs_info->fs_state)))
btrfs_cleanup_transaction(root);
-   if (!try_to_freeze()) {
-   set_current_state(TASK_INTERRUPTIBLE);
-   if (!kthread_should_stop() &&
-   (!btrfs_transaction_blocked(root->fs_info) ||
-cannot_commit))
-   schedule_timeout(delay);
-   __set_current_state(TASK_RUNNING);
-   }
+   set_current_state(TASK_INTERRUPTIBLE);
+   if (!kthread_should_stop() &&
+   (!btrfs_transaction_blocked(root->fs_info) ||
+cannot_commit))
+   schedule_timeout(delay);
+   __set_current_state(TASK_RUNNING);
} while (!kthread_should_stop());
return 0;
 }

-- 
Jiri Kosina
SUSE Labs



[PATCH 1/2] btrfs: cleaner_kthread() doesn't need explicit freeze

2016-03-15 Thread Jiri Kosina
cleaner_kthread() is not marked freezable, and therefore calling 
try_to_freeze() in its context is a pointless no-op.

In addition to that, as has been clearly demonstrated by 80ad623edd2d 
("Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"), it's perfectly 
valid / legal for cleaner_kthread() to stay scheduled out in an arbitrary 
place during suspend (in that particular example that was waiting for 
reading of extent pages), so there is no need to leave any traces of 
freezer in this kthread.

Fixes: 80ad623edd2d ("Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()")
Fixes: 696249132158 ("btrfs: clear PF_NOFREEZE in cleaner_kthread()")
Signed-off-by: Jiri Kosina 
---
 fs/btrfs/disk-io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 4545e2e..d8d68af 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1830,7 +1830,7 @@ static int cleaner_kthread(void *arg)
 */
btrfs_delete_unused_bgs(root->fs_info);
 sleep:
-   if (!try_to_freeze() && !again) {
+   if (!again) {
set_current_state(TASK_INTERRUPTIBLE);
if (!kthread_should_stop())
schedule();
-- 
Jiri Kosina
SUSE Labs


Re: [PATCH 12/12] block: test fallocate for block devices

2016-03-15 Thread Christoph Hellwig
On Tue, Mar 15, 2016 at 02:41:48PM +1100, Dave Chinner wrote:
> I think it's the right place to test it - we have all the
> infrastructure available to do it (i.e. xfs_io and various block
> devices) and we really need to make sure this stuff works,
> especially if we start to write filesystem code that depends on
> correct behaviour...

Ok.  But let's keep it outside of the auto group so we don't run purely
block device specific tests every time we do an xfstests run.


[PATCH] btrfs/059: add a filter for btrfs compression property

2016-03-15 Thread Xiaoguang Wang
From: Wang Xiaoguang 

btrfs/059.out should not hardcode zlib: if the compression method is lzo,
the test fails even though nothing is wrong. Add a filter that masks the
compression algorithm in the output.

Signed-off-by: Wang Xiaoguang 
---
 common/filter.btrfs |  4 ++++
 tests/btrfs/059     | 16 +++++++++++-----
 tests/btrfs/059.out |  6 +++---
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/common/filter.btrfs b/common/filter.btrfs
index 9bb6479..56cf4b2 100644
--- a/common/filter.btrfs
+++ b/common/filter.btrfs
@@ -65,7 +65,11 @@ _filter_transaction_commit() {
 _filter_btrfs_subvol_delete()
 {
 	_filter_scratch | _filter_transaction_commit
+}
 
+_filter_btrfs_compress_property()
+{
+   sed -e "s/compression=\(lzo\|zlib\)/COMPRESSION=XXX/g"
 }
 
 # make sure this script returns success
diff --git a/tests/btrfs/059 b/tests/btrfs/059
index f6c2e27..d7db1df 100755
--- a/tests/btrfs/059
+++ b/tests/btrfs/059
@@ -44,6 +44,7 @@ _cleanup()
 # get standard environment, filters and checks
 . ./common/rc
 . ./common/filter
+. ./common/filter.btrfs
 
 # real QA test starts here
 _supported_fs btrfs
@@ -61,24 +62,29 @@ mkdir $SCRATCH_MNT/testdir
 echo "Setting compression flag in the directory..."
 chattr +c $SCRATCH_MNT/testdir
 echo "Directory compression property value:"
-$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir compression
+$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir compression | \
+   _filter_btrfs_compress_property
 
 touch $SCRATCH_MNT/testdir/file1
 echo "file1 compression property value:"
-$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir/file1 compression
+$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir/file1 compression | \
+   _filter_btrfs_compress_property
 
 echo "Clearing compression flag from directory..."
 chattr -c $SCRATCH_MNT/testdir
 echo "Directory compression property value:"
-$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir compression
+$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir compression | \
+   _filter_btrfs_compress_property
 
 touch $SCRATCH_MNT/testdir/file2
 echo "file2 compression property value:"
-$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir/file2 compression
+$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir/file2 compression | \
+   _filter_btrfs_compress_property
 
 touch $SCRATCH_MNT/testdir/file1
 echo "file1 compression property value:"
-$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir/file1 compression
+$BTRFS_UTIL_PROG property get $SCRATCH_MNT/testdir/file1 compression | \
+   _filter_btrfs_compress_property
 
 status=0
 exit
diff --git a/tests/btrfs/059.out b/tests/btrfs/059.out
index 9ec9a53..4e7539a 100644
--- a/tests/btrfs/059.out
+++ b/tests/btrfs/059.out
@@ -1,11 +1,11 @@
 QA output created by 059
 Setting compression flag in the directory...
 Directory compression property value:
-compression=zlib
+COMPRESSION=XXX
 file1 compression property value:
-compression=zlib
+COMPRESSION=XXX
 Clearing compression flag from directory...
 Directory compression property value:
 file2 compression property value:
 file1 compression property value:
-compression=zlib
+COMPRESSION=XXX
-- 
1.8.3.1
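
For readers who want to see what the new filter does without running
xfstests, here is a minimal Python sketch of the same substitution. It
mirrors the sed expression in _filter_btrfs_compress_property; the
function name filter_compress_property is made up for this illustration
and is not part of xfstests.

```python
import re

# Python equivalent of:
#   sed -e "s/compression=\(lzo\|zlib\)/COMPRESSION=XXX/g"
_COMPRESS_RE = re.compile(r"compression=(lzo|zlib)")


def filter_compress_property(text: str) -> str:
    """Mask the algorithm so the golden output does not depend on it."""
    return _COMPRESS_RE.sub("COMPRESSION=XXX", text)


if __name__ == "__main__":
    # An empty line (property not set) passes through unchanged.
    for line in ("compression=zlib", "compression=lzo", ""):
        print(repr(filter_compress_property(line)))
```

As in the patch, only lzo and zlib are matched; any other value would
pass through unfiltered and still show up as a diff against 059.out.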





Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber
On Mon, Mar 14, 2016 at 01:00:13AM +0100, Henk Slager wrote:
> On Sun, Mar 13, 2016 at 9:56 PM, Marc Haber wrote:
> > Yes, I want to keep the possibility to remove huge files from
> > snapshots that shouldnt have been on a snapshotted volume in the first
> > place without having to ditch the entire snapshot.
> 
> You could do ro snapshotting and in case you want to modify something
> inside a snapshot/subvolume:
> # btrfs property set  ro false
> # rm /
> # btrfs property set  ro true

I was not aware that it is possible to fiddle with the ro property of
an already existing snapshot. I am not yet sure whether I love or hate
this.
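
The three quoted commands can be wrapped so that the snapshot is flipped
back to read-only even when the removal fails. The following is only a
sketch: the helper name, the dry_run switch and the example paths are
assumptions made for illustration, and the only btrfs call used is the
"btrfs property set ... ro" form quoted above.

```python
import os
import subprocess


def remove_from_ro_snapshot(snapshot, relpath, dry_run=True):
    """Build (and optionally run) the rw / rm / ro sequence for one file.

    snapshot and relpath are hypothetical example arguments; relpath is
    taken relative to the snapshot root.
    """
    if os.path.isabs(relpath):
        raise ValueError("relpath must be relative to the snapshot root")
    target = os.path.join(snapshot, relpath)

    make_rw = ["btrfs", "property", "set", snapshot, "ro", "false"]
    remove = ["rm", "--", target]
    make_ro = ["btrfs", "property", "set", snapshot, "ro", "true"]

    if not dry_run:
        subprocess.run(make_rw, check=True)
        try:
            subprocess.run(remove, check=True)
        finally:
            # Restore the read-only flag even if rm failed.
            subprocess.run(make_ro, check=True)
    return [make_rw, remove, make_ro]


if __name__ == "__main__":
    # Dry run: only print what would be executed.
    for cmd in remove_from_ro_snapshot("/mnt/snapshots/example", "big.iso"):
        print(" ".join(cmd))
```

With dry_run left at its default nothing is executed, so the sequence
can be reviewed before it touches a real snapshot.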

> >> Also, If some part of the OS or tools scans through the snapshot dirs
> >> every now and then with atime creation on, metadata grows without a
> >> real need.
> >
> > I mount with noatime and nodiratime anyway, and the directory the
> > snapshots are mounted to (/mnt/snapshots) is excluded in
> > updatedb.conf. Any other idea which tool might scan filesystems and
> > go unnoticed while it walks a five-digit number of snapshots?
> 
> Maybe baloo or so if you use KDE.

I usually do those tests via ssh without even being logged in to a
local desktop.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber
On Tue, Mar 15, 2016 at 12:22:00AM +0100, Henk Slager wrote:
> The other question is: What is mounted on /media/tempdisk/  ?

The "old" btrfs filesystem "ofanbtr", formerly 417 GB in size, now
resized to 300 GB. Does it need to be umounted to be checked?

> At least I think a check of the current 200GiB fs is needed. As it is
> a rootfs and encrypted, some work is needed to make that happen.

You suggested a btrfs check after looking at the image of "ofanbtr".
Do you want me to check the new "fanbtr" also?

Too bad that we went back to looking at "ofanbtr" after I changed the
subject to avoid mixing up both instances.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber
On Mon, Mar 14, 2016 at 09:39:51PM +0100, Henk Slager wrote:
> >> BTW, I restored and mounted your 20160307-fanbtr-image:
> >>
> >> [266169.207952] BTRFS: device label fanbtr devid 1 transid 22215732 /dev/loop0
> >> [266203.734804] BTRFS info (device loop0): disk space caching is enabled
> >> [266203.734806] BTRFS: has skinny extents
> >> [266204.022175] BTRFS: checking UUID tree
> >> [266239.407249] attempt to access beyond end of device
> >> [266239.407252] loop0: rw=1073, want=715202688, limit=70576
> >> [266239.407254] BTRFS error (device loop0): bdev /dev/loop0 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> >> [266239.407272] attempt to access beyond end of device
> >> .. and 16 more
> >>
> >> As a quick fix/workaround, I truncated the image to 1T
> >
> > The original fs was 417 GiB in size. What size does the image claim?
> 
> ls -alFh  of the restored image showed 337G I remember.
> btrfs fi us showed also a number over 400G, I don't have the
> files/loopdev anymore.

sounds legit.
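
To put the want= and limit= values from the kernel log quoted above into
familiar units, here is a small conversion sketch. It assumes that both
numbers are counted in 512-byte sectors, which is how the block layer
normally reports them; the names SECTOR_BYTES and sectors_to_bytes are
made up for this example.

```python
SECTOR_BYTES = 512  # assumption: want=/limit= are in 512-byte sectors


def sectors_to_bytes(sectors: int) -> int:
    """Convert a sector count from the kernel log into bytes."""
    return sectors * SECTOR_BYTES


want = sectors_to_bytes(715202688)  # end of the attempted write
limit = sectors_to_bytes(70576)     # size of /dev/loop0 at that time

print(f"want  = {want / 2**30:.1f} GiB")   # want  = 341.0 GiB
print(f"limit = {limit / 2**20:.1f} MiB")  # limit = 34.5 MiB
```

Under that assumption the write targeted an offset of roughly 341 GiB on
a loop device of only about 34.5 MiB, which would fit the observation
that truncating the image file to a larger size worked around the error.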

> It could be some side effect of btrfs-image; I have only used it for
> multi-device setups, where dev IDs are ignored, but the total image
> size did not lead to problems.

The original "ofanbtr" seems to have a problem, since btrfs check
/media/tempdisk says:

> > [10/509]mh@fan:~$ sudo btrfs check /media/tempdisk/
> > Superblock bytenr is larger than device size
> > Couldn't open file system
> > [11/509]mh@fan:~$
> >
> > Can this be fixed?
> 
> What I would do in order to fix it, is resize the fs to let's say
> 190GiB. That should write correct values to the superblocks I /hope/.
> And then resize back to max.

It doesn't:
[20/518]mh@fan:~$ sudo btrfs filesystem resize 300G /media/tempdisk/
Resize '/media/tempdisk/' of '300G'
[22/520]mh@fan:~$ sudo btrfs check /media/tempdisk/
Superblock bytenr is larger than device size
Couldn't open file system
[23/521]mh@fan:~$ df -h

> Maybe btrfs check --repair can also fix it, but before doing --repair
> or other actions, I would see what else besides btrfs could be wrong,
> see also suggestion of Holger.

Like putting the filesystem on an unencrypted medium? Sorry, no,
private data, paranoia.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: [PATCH v2] btrfs: move btrfs_compression_type to compression.h

2016-03-15 Thread Anand Jain



On 03/12/2016 12:11 AM, David Sterba wrote:
> On Thu, Mar 10, 2016 at 05:26:59PM +0800, Anand Jain wrote:
> > So that its better organized.
> >
> > Signed-off-by: Anand Jain 
>
> Reviewed-by: David Sterba 

Thanks.