Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-11 Thread Oleg Drokin

> On Feb 11, 2018, at 6:44 PM, NeilBrown <ne...@suse.com> wrote:
> 
> On Thu, Feb 08 2018, Oleg Drokin wrote:
>> 
>> Certain things that sound useless (like the debug subsystem in Lustre)
>> is very useful when you have a 10k nodes in a cluster and need to selectively
>> pull stuff from a run to debug a complicated cross-node interaction.
>> I asked NFS people how do they do it and they don’t have anything that scales
>> and usually involves reducing the problem to a much smaller set of nodes 
>> first.
> 
> the "rpcdebug" stuff that Linux/nfs has is sometimes useful, but some parts
> are changing to tracepoints and some parts have remained, which is a
> little confusing.
> 
> The fact that lustre tracing seems to *always* log everything so that if
> something goes wrong you can extract that last few meg(?) of logs seems
> really useful.

Not really. Lustre also has a bitmask for logs (since otherwise all those prints
are pretty cpu taxing), but what makes those logs better is:
the size is unlimited, not constrained by dmesg buffer size.
You can capture those logs from a crashdump (something I really wish
somebody would implement for tracepoint buffers, but alas, I have not
found anything for this yet - we have a crash plugin to extract lustre
debug logs from a kernel crashdump).
>>> 
>>> Even if it is horrible it would be nice to have it in staging... I guess
>>> the changes required to ext4 prohibit that... I don't suppose it can be
>>> made to work with mainline ext4 in a reduced-functionality-and-performance
>>> way??
>> 
>> We support unpatched ZFS as a server too! ;)
> 
> So that that mean you would expect lustre-server to work with unpatched
> ext4? In that case I won't give up hope of seeing the server in mainline
> in my lifetime.  Client first though.

While unpatched ext4 might in theory be possible, currently it does not export
everything we need from the transaction/fs control perspective.

Bye,
Oleg


Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-11 Thread Oleg Drokin

> On Feb 11, 2018, at 6:44 PM, NeilBrown  wrote:
> 
> On Thu, Feb 08 2018, Oleg Drokin wrote:
>> 
>> Certain things that sound useless (like the debug subsystem in Lustre)
>> is very useful when you have a 10k nodes in a cluster and need to selectively
>> pull stuff from a run to debug a complicated cross-node interaction.
>> I asked NFS people how do they do it and they don’t have anything that scales
>> and usually involves reducing the problem to a much smaller set of nodes 
>> first.
> 
> the "rpcdebug" stuff that Linux/nfs has is sometimes useful, but some parts
> are changing to tracepoints and some parts have remained, which is a
> little confusing.
> 
> The fact that lustre tracing seems to *always* log everything so that if
> something goes wrong you can extract that last few meg(?) of logs seems
> really useful.

Not really. Lustre also has a bitmask for logs (since otherwise all those prints
are pretty cpu taxing), but what makes those logs better is:
the size is unlimited, not constrained by dmesg buffer size.
You can capture those logs from a crashdump (something I really wish
somebody would implement for tracepoint buffers, but alas, I have not
found anything for this yet - we have a crash plugin to extract lustre
debug logs from a kernel crashdump).
>>> 
>>> Even if it is horrible it would be nice to have it in staging... I guess
>>> the changes required to ext4 prohibit that... I don't suppose it can be
>>> made to work with mainline ext4 in a reduced-functionality-and-performance
>>> way??
>> 
>> We support unpatched ZFS as a server too! ;)
> 
> So that that mean you would expect lustre-server to work with unpatched
> ext4? In that case I won't give up hope of seeing the server in mainline
> in my lifetime.  Client first though.

While unpatched ext4 might in theory be possible, currently it does not export
everything we need from the transaction/fs control perspective.

Bye,
Oleg


Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-11 Thread Oleg Drokin

> On Feb 11, 2018, at 6:50 PM, NeilBrown  wrote:
> 
> Maybe - as you suggest in another email - it is due to some
> client/server incompatibility.  I guess it is unavoidable with an fs
> like lustre to have incompatible protocol changes.  Is there any
> mechanism for detecting the version of other peers in the cluster and
> refusing to run if versions are incompatible?

Yes, client and server exchange “feature bits” at connect time
and only use the subset of features that both can understand.

Bye,
Oleg


Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-11 Thread Oleg Drokin

> On Feb 11, 2018, at 6:50 PM, NeilBrown  wrote:
> 
> Maybe - as you suggest in another email - it is due to some
> client/server incompatibility.  I guess it is unavoidable with an fs
> like lustre to have incompatible protocol changes.  Is there any
> mechanism for detecting the version of other peers in the cluster and
> refusing to run if versions are incompatible?

Yes, client and server exchange “feature bits” at connect time
and only use the subset of features that both can understand.

Bye,
Oleg


Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin

> On Feb 8, 2018, at 10:10 PM, NeilBrown <ne...@suse.com> wrote:
> 
> On Thu, Feb 08 2018, Oleg Drokin wrote:
> 
>>> On Feb 8, 2018, at 8:39 PM, NeilBrown <ne...@suse.com> wrote:
>>> 
>>> On Tue, Aug 16 2016, James Simmons wrote:
>> 
>> my that’s an old patch
>> 
>>> 
> ...
>>> 
>>> Whoever converted it to "!strcmp()" inverted the condition.  This is a
>>> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
>>> 
>>> This causes many tests in the 'sanity' test suite to return
>>> -ENOMEM (that had me puzzled for a while!!).
>> 
>> huh? I am not seeing anything of the sort and I was running sanity
>> all the time until a recent pause (but going to resume).
> 
> That does surprised me - I reproduce it every time.
> I have two VMs running a SLE12-SP2 kernel with patches from
> lustre-release applied.  These are servers. They have 2 3G virtual disks
> each.
> I have two over VMs running current mainline.  These are clients.
> 
> I guess your 'recent pause' included between v4.15-rc1 (8e55b6fd0660)
> and v4.15-rc6 (a93639090a27) - a full month when lustre wouldn't work at
> all :-(

More than that, but I am pretty sure James Simmons is running tests all the 
time too
(he has a different config, I only have tcp).

>>> This seems to suggest that no-one has been testing the mainline linux
>>> lustre.
>>> It also seems to suggest that there is a good chance that there
>>> are other bugs that have crept in while no-one has really been caring.
>>> Given that the sanity test suite doesn't complete for me, but just
>>> hangs (in test_27z I think), that seems particularly likely.
>> 
>> Works for me, here’s a run from earlier today on 4.15.0:
> 
> Well that's encouraging .. I haven't looked into this one yet - I'm not
> even sure where to start.

m… debug logs for example (greatly neutered in staging tree, but still useful)?
try lctl dk and see what’s in there.

>> Instead the plan was to clean up the staging client into acceptable state,
>> move it out of staging, bring in all the missing features and then
>> drop the client (more or less) from the lustre-release.
> 
> That sounds like a great plan.  Any idea why it didn't happen?

Because meeting open-ended demands is hard and certain demands sound like
“throw away your X and rewrite it from scratch" (e.g. everything IB-related).

Certain things that sound useless (like the debug subsystem in Lustre)
is very useful when you have a 10k nodes in a cluster and need to selectively
pull stuff from a run to debug a complicated cross-node interaction.
I asked NFS people how do they do it and they don’t have anything that scales
and usually involves reducing the problem to a much smaller set of nodes first.

> It seems there is a lot of upstream work mixed in with the clean up, and
> I don't think that really helps anyone.

I don’t understand what you mean here.

> Is it at all realistic that the client might be removed from
> lustre-release?  That might be a good goal to work towards.

Assuming we can bring the whole functionality over - sure.

Of course there’d still be some separate development place and we would
need to create patches (new features?) for like SuSE and other distros
and for testing of server features, I guess, but that could just that -
a side branch somewhere I hope.

It’s not that we are super glad to chase every kernel vendors put out,
of course it would be much easier if the kernels already included
a very functional Lustre client.

>>> Might it make sense to instead start cleaning up the code in
>>> lustre-release so as to make it meet the upstream kernel standards.
>>> Then when the time is right, the kernel code can be moved *out* of
>>> lustre-release and *in* to linux.  Then development can continue in
>>> Linux (just like it does with other Linux filesystems).
>> 
>> While we can be cleaning lustre in lustre-release, there are some things
>> we cannot do as easily, e.g. decoupling Lustre client from the server.
>> Also it would not attract any reviews from all the janitor or
>> (more importantly) Al Viro and other people with a sharp eyes.
>> 
>>> An added bonus of this is that there is an obvious path to getting
>>> server support in mainline Linux.  The current situation of client-only
>>> support seems weird given how interdependent the two are.
>> 
>> Given the pushback Lustre client was given I have no hope Lustre server
>> will get into mainline in my lifetime.
> 
> Even if it is horrible it would be nice to have it in staging... I guess
> the changes required to e

Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin

> On Feb 8, 2018, at 10:10 PM, NeilBrown  wrote:
> 
> On Thu, Feb 08 2018, Oleg Drokin wrote:
> 
>>> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
>>> 
>>> On Tue, Aug 16 2016, James Simmons wrote:
>> 
>> my that’s an old patch
>> 
>>> 
> ...
>>> 
>>> Whoever converted it to "!strcmp()" inverted the condition.  This is a
>>> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
>>> 
>>> This causes many tests in the 'sanity' test suite to return
>>> -ENOMEM (that had me puzzled for a while!!).
>> 
>> huh? I am not seeing anything of the sort and I was running sanity
>> all the time until a recent pause (but going to resume).
> 
> That does surprised me - I reproduce it every time.
> I have two VMs running a SLE12-SP2 kernel with patches from
> lustre-release applied.  These are servers. They have 2 3G virtual disks
> each.
> I have two over VMs running current mainline.  These are clients.
> 
> I guess your 'recent pause' included between v4.15-rc1 (8e55b6fd0660)
> and v4.15-rc6 (a93639090a27) - a full month when lustre wouldn't work at
> all :-(

More than that, but I am pretty sure James Simmons is running tests all the 
time too
(he has a different config, I only have tcp).

>>> This seems to suggest that no-one has been testing the mainline linux
>>> lustre.
>>> It also seems to suggest that there is a good chance that there
>>> are other bugs that have crept in while no-one has really been caring.
>>> Given that the sanity test suite doesn't complete for me, but just
>>> hangs (in test_27z I think), that seems particularly likely.
>> 
>> Works for me, here’s a run from earlier today on 4.15.0:
> 
> Well that's encouraging .. I haven't looked into this one yet - I'm not
> even sure where to start.

m… debug logs for example (greatly neutered in staging tree, but still useful)?
try lctl dk and see what’s in there.

>> Instead the plan was to clean up the staging client into acceptable state,
>> move it out of staging, bring in all the missing features and then
>> drop the client (more or less) from the lustre-release.
> 
> That sounds like a great plan.  Any idea why it didn't happen?

Because meeting open-ended demands is hard and certain demands sound like
“throw away your X and rewrite it from scratch" (e.g. everything IB-related).

Certain things that sound useless (like the debug subsystem in Lustre)
is very useful when you have a 10k nodes in a cluster and need to selectively
pull stuff from a run to debug a complicated cross-node interaction.
I asked NFS people how do they do it and they don’t have anything that scales
and usually involves reducing the problem to a much smaller set of nodes first.

> It seems there is a lot of upstream work mixed in with the clean up, and
> I don't think that really helps anyone.

I don’t understand what you mean here.

> Is it at all realistic that the client might be removed from
> lustre-release?  That might be a good goal to work towards.

Assuming we can bring the whole functionality over - sure.

Of course there’d still be some separate development place and we would
need to create patches (new features?) for like SuSE and other distros
and for testing of server features, I guess, but that could just that -
a side branch somewhere I hope.

It’s not that we are super glad to chase every kernel vendors put out,
of course it would be much easier if the kernels already included
a very functional Lustre client.

>>> Might it make sense to instead start cleaning up the code in
>>> lustre-release so as to make it meet the upstream kernel standards.
>>> Then when the time is right, the kernel code can be moved *out* of
>>> lustre-release and *in* to linux.  Then development can continue in
>>> Linux (just like it does with other Linux filesystems).
>> 
>> While we can be cleaning lustre in lustre-release, there are some things
>> we cannot do as easily, e.g. decoupling Lustre client from the server.
>> Also it would not attract any reviews from all the janitor or
>> (more importantly) Al Viro and other people with a sharp eyes.
>> 
>>> An added bonus of this is that there is an obvious path to getting
>>> server support in mainline Linux.  The current situation of client-only
>>> support seems weird given how interdependent the two are.
>> 
>> Given the pushback Lustre client was given I have no hope Lustre server
>> will get into mainline in my lifetime.
> 
> Even if it is horrible it would be nice to have it in staging... I guess
> the changes required to ext4 prohibit that... I don't suppose it can be
> 

Re: [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin

> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
> 
> On Tue, Aug 16 2016, James Simmons wrote:

my that’s an old patch

> 
>> 
>> +static inline bool
>> +lsm_md_eq(const struct lmv_stripe_md *lsm1, const struct lmv_stripe_md 
>> *lsm2)
>> +{
>> +int idx;
>> +
>> +if (lsm1->lsm_md_magic != lsm2->lsm_md_magic ||
>> +lsm1->lsm_md_stripe_count != lsm2->lsm_md_stripe_count ||
>> +lsm1->lsm_md_master_mdt_index != lsm2->lsm_md_master_mdt_index ||
>> +lsm1->lsm_md_hash_type != lsm2->lsm_md_hash_type ||
>> +lsm1->lsm_md_layout_version != lsm2->lsm_md_layout_version ||
>> +!strcmp(lsm1->lsm_md_pool_name, lsm2->lsm_md_pool_name))
>> +return false;
> 
> Hi James and all,
> This patch (8f18c8a48b736c2f in linux) is different from the
> corresponding patch in lustre-release (60e07b972114df).
> 
> In that patch, the last clause in the 'if' condition is
> 
> +   strcmp(lsm1->lsm_md_pool_name,
> + lsm2->lsm_md_pool_name) != 0)
> 
> Whoever converted it to "!strcmp()" inverted the condition.  This is a
> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
> 
> This causes many tests in the 'sanity' test suite to return
> -ENOMEM (that had me puzzled for a while!!).

huh? I am not seeing anything of the sort and I was running sanity
all the time until a recent pause (but going to resume).

> This seems to suggest that no-one has been testing the mainline linux
> lustre.
> It also seems to suggest that there is a good chance that there
> are other bugs that have crept in while no-one has really been caring.
> Given that the sanity test suite doesn't complete for me, but just
> hangs (in test_27z I think), that seems particularly likely.

Works for me, here’s a run from earlier today on 4.15.0:
== sanity test 27z: check SEQ/OID on the MDT and OST filesystems 
= 16:43:58 (1518126238)
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0169548 s, 61.8 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.02782 s, 75.4 MB/s
check file /mnt/lustre/d27z.sanity/f27z.sanity-1
FID seq 0x20401, oid 0x4640 ver 0x0
LOV seq 0x20401, oid 0x4640, count: 1
want: stripe:0 ost:0 oid:314/0x13a seq:0
Stopping /mnt/lustre-ost1 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST
/mnt/lustre-ost1/O/0/d26/314: parent=[0x20401:0x4640:0x0] stripe=0 
stripe_size=0 stripe_count=0
check file /mnt/lustre/d27z.sanity/f27z.sanity-2
FID seq 0x20401, oid 0x4642 ver 0x0
LOV seq 0x20401, oid 0x4642, count: 2
want: stripe:0 ost:1 oid:1187/0x4a3 seq:0
Stopping /mnt/lustre-ost2 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST0001
/mnt/lustre-ost2/O/0/d3/1187: parent=[0x20401:0x4642:0x0] stripe=0 
stripe_size=0 stripe_count=0
want: stripe:1 ost:0 oid:315/0x13b seq:0
got: objid=0 seq=0 parent=[0x20401:0x4642:0x0] stripe=1
Resetting fail_loc on all nodes...done.
16:44:32 (1518126272) waiting for centos6-16 network 5 secs ...
16:44:32 (1518126272) network interface is UP
16:44:33 (1518126273) waiting for centos6-17 network 5 secs ...
16:44:33 (1518126273) network interface is UP


> So my real question - to anyone interested in lustre for mainline linux
> - is: can we actually trust this code at all?

Absolutely. Seems that you just stumbled upon a corner case that was not
being hit by people that do the testing, so you have something unique about
your setup, I guess.

> I'm seriously tempted to suggest that we just
>  rm -r drivers/staging/lustre
> 
> drivers/staging is great for letting the community work on code that has
> been "thrown over the wall" and is not openly developed elsewhere, but
> that is not the case for lustre.  lustre has (or seems to have) an open
> development process.  Having on-going development happen both there and
> in drivers/staging seems a waste of resources.

Re: [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin

> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
> 
> On Tue, Aug 16 2016, James Simmons wrote:

my that’s an old patch

> 
>> 
>> +static inline bool
>> +lsm_md_eq(const struct lmv_stripe_md *lsm1, const struct lmv_stripe_md 
>> *lsm2)
>> +{
>> +int idx;
>> +
>> +if (lsm1->lsm_md_magic != lsm2->lsm_md_magic ||
>> +lsm1->lsm_md_stripe_count != lsm2->lsm_md_stripe_count ||
>> +lsm1->lsm_md_master_mdt_index != lsm2->lsm_md_master_mdt_index ||
>> +lsm1->lsm_md_hash_type != lsm2->lsm_md_hash_type ||
>> +lsm1->lsm_md_layout_version != lsm2->lsm_md_layout_version ||
>> +!strcmp(lsm1->lsm_md_pool_name, lsm2->lsm_md_pool_name))
>> +return false;
> 
> Hi James and all,
> This patch (8f18c8a48b736c2f in linux) is different from the
> corresponding patch in lustre-release (60e07b972114df).
> 
> In that patch, the last clause in the 'if' condition is
> 
> +   strcmp(lsm1->lsm_md_pool_name,
> + lsm2->lsm_md_pool_name) != 0)
> 
> Whoever converted it to "!strcmp()" inverted the condition.  This is a
> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
> 
> This causes many tests in the 'sanity' test suite to return
> -ENOMEM (that had me puzzled for a while!!).

huh? I am not seeing anything of the sort and I was running sanity
all the time until a recent pause (but going to resume).

> This seems to suggest that no-one has been testing the mainline linux
> lustre.
> It also seems to suggest that there is a good chance that there
> are other bugs that have crept in while no-one has really been caring.
> Given that the sanity test suite doesn't complete for me, but just
> hangs (in test_27z I think), that seems particularly likely.

Works for me, here’s a run from earlier today on 4.15.0:
== sanity test 27z: check SEQ/OID on the MDT and OST filesystems 
= 16:43:58 (1518126238)
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0169548 s, 61.8 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.02782 s, 75.4 MB/s
check file /mnt/lustre/d27z.sanity/f27z.sanity-1
FID seq 0x20401, oid 0x4640 ver 0x0
LOV seq 0x20401, oid 0x4640, count: 1
want: stripe:0 ost:0 oid:314/0x13a seq:0
Stopping /mnt/lustre-ost1 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST
/mnt/lustre-ost1/O/0/d26/314: parent=[0x20401:0x4640:0x0] stripe=0 
stripe_size=0 stripe_count=0
check file /mnt/lustre/d27z.sanity/f27z.sanity-2
FID seq 0x20401, oid 0x4642 ver 0x0
LOV seq 0x20401, oid 0x4642, count: 2
want: stripe:0 ost:1 oid:1187/0x4a3 seq:0
Stopping /mnt/lustre-ost2 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST0001
/mnt/lustre-ost2/O/0/d3/1187: parent=[0x20401:0x4642:0x0] stripe=0 
stripe_size=0 stripe_count=0
want: stripe:1 ost:0 oid:315/0x13b seq:0
got: objid=0 seq=0 parent=[0x20401:0x4642:0x0] stripe=1
Resetting fail_loc on all nodes...done.
16:44:32 (1518126272) waiting for centos6-16 network 5 secs ...
16:44:32 (1518126272) network interface is UP
16:44:33 (1518126273) waiting for centos6-17 network 5 secs ...
16:44:33 (1518126273) network interface is UP


> So my real question - to anyone interested in lustre for mainline linux
> - is: can we actually trust this code at all?

Absolutely. Seems that you just stumbled upon a corner case that was not
being hit by people that do the testing, so you have something unique about
your setup, I guess.

> I'm seriously tempted to suggest that we just
>  rm -r drivers/staging/lustre
> 
> drivers/staging is great for letting the community work on code that has
> been "thrown over the wall" and is not openly developed elsewhere, but
> that is not the case for lustre.  lustre has (or seems to have) an open
> development process.  Having on-going development happen both there and
> in drivers/staging seems a waste of resources.

It is a bit of 

Re: [PATCH 03/12] staging: lustre: llite: fix various issues with ll_splice_alias.

2017-07-19 Thread Oleg Drokin

On Jul 19, 2017, at 12:33 AM, NeilBrown wrote:

> On Tue, Jul 18 2017, Oleg Drokin wrote:
> 
>> Unfortunately this patch causes insta-crash on first stat call after mount.
>> Sorry, I cannot dig into this deeper right this moment, but I will a bit 
>> later.
> 
> V.strange.  The crash suggests that the lock, and hence the inode, is
> not initialized.  I cannot see how that might happen.
> though...
> 
>>> +   spin_lock(>lli_lock);
>>> +   new = ll_find_invalid_alias(inode, de);
>>> +   if (!new)
>>> +   d_add(de, inode);
>>> +   spin_lock(>lli_lock);
> 
> Had it not crashed, it would have deadlocked.  That second spin_lock()
> should be spin_unlock() :-( I don't *think* that would have caused this crash…

No, that's not it.
-   d_add(de, inode);
-   CDEBUG(D_DENTRY, "Add dentry %p inode %p refc %d flags %#x\n",
-  de, d_inode(de), d_count(de), de->d_flags);
+   de = d_splice_alias(inode, de);
+   if (!IS_ERR(de))
+   CDEBUG(D_DENTRY, "Add dentry %p inode %p refc %d flags %#x\n",
+  de, d_inode(de), d_count(de), de->d_flags);
return de;

This is likely it.
d_splice_alias would return NULL if there's no alias, after d_add of the de.
But ll_splice_alias callers expect either an ERR_PTR on error or dentry to use
otherwise.
So in ll_lookup_it_finish() we have:
alias = ll_splice_alias(inode, *de);
if (IS_ERR(alias)) {
rc = PTR_ERR(alias);
goto out;
}
*de = alias;
…
if (md_revalidate_lock(ll_i2mdexp(parent), _it, ,
   NULL)) {
===>>> whoops!  d_lustre_revalidate(*de);
ll_intent_release(_it);
}

>> I am adding Al that we discussed this code at some length and he found no 
>> problems
>> here, so I am a bit surprised by your findings.
> I'd be very happy to read Al's thoughts.

Some of the discussions were in lkml under subject of "More parallel
atomic_open/d_splice_alias fun with NFS and possibly more FSes"
in July 2016, in between nfs stuff.
There was more but I cannot readily find it.

>> Also the reason we reinvent the d_splice_alias is because we need to
>> splice not just directories, but also regular files.
> 
> I see that.  A key simplification I bring is that directories and
> non-directories can be handled separately.  d_splice_alias() does
> all we need for directories, and nothing useful for non-dirs.

I see.
I still need to  think some more about this whole thing.

Please see commit 99f1c013194e64d4b67d5d318148303b0e1585e1 for double d_add
of the same dentry fix.6



Re: [PATCH 03/12] staging: lustre: llite: fix various issues with ll_splice_alias.

2017-07-19 Thread Oleg Drokin

On Jul 19, 2017, at 12:33 AM, NeilBrown wrote:

> On Tue, Jul 18 2017, Oleg Drokin wrote:
> 
>> Unfortunately this patch causes insta-crash on first stat call after mount.
>> Sorry, I cannot dig into this deeper right this moment, but I will a bit 
>> later.
> 
> V.strange.  The crash suggests that the lock, and hence the inode, is
> not initialized.  I cannot see how that might happen.
> though...
> 
>>> +   spin_lock(>lli_lock);
>>> +   new = ll_find_invalid_alias(inode, de);
>>> +   if (!new)
>>> +   d_add(de, inode);
>>> +   spin_lock(>lli_lock);
> 
> Had it not crashed, it would have deadlocked.  That second spin_lock()
> should be spin_unlock() :-( I don't *think* that would have caused this crash…

No, that's not it.
-   d_add(de, inode);
-   CDEBUG(D_DENTRY, "Add dentry %p inode %p refc %d flags %#x\n",
-  de, d_inode(de), d_count(de), de->d_flags);
+   de = d_splice_alias(inode, de);
+   if (!IS_ERR(de))
+   CDEBUG(D_DENTRY, "Add dentry %p inode %p refc %d flags %#x\n",
+  de, d_inode(de), d_count(de), de->d_flags);
return de;

This is likely it.
d_splice_alias would return NULL if there's no alias, after d_add of the de.
But ll_splice_alias callers expect either an ERR_PTR on error or dentry to use
otherwise.
So in ll_lookup_it_finish() we have:
alias = ll_splice_alias(inode, *de);
if (IS_ERR(alias)) {
rc = PTR_ERR(alias);
goto out;
}
*de = alias;
…
if (md_revalidate_lock(ll_i2mdexp(parent), _it, ,
   NULL)) {
===>>> whoops!  d_lustre_revalidate(*de);
ll_intent_release(_it);
}

>> I am adding Al that we discussed this code at some length and he found no 
>> problems
>> here, so I am a bit surprised by your findings.
> I'd be very happy to read Al's thoughts.

Some of the discussions were in lkml under subject of "More parallel
atomic_open/d_splice_alias fun with NFS and possibly more FSes"
in July 2016, in between nfs stuff.
There was more but I cannot readily find it.

>> Also the reason we reinvent the d_splice_alias is because we need to
>> splice not just directories, but also regular files.
> 
> I see that.  A key simplification I bring is that directories and
> non-directories can be handled separately.  d_splice_alias() does
> all we need for directories, and nothing useful for non-dirs.

I see.
I still need to  think some more about this whole thing.

Please see commit 99f1c013194e64d4b67d5d318148303b0e1585e1 for double d_add
of the same dentry fix.6



Re: [PATCH 03/12] staging: lustre: llite: fix various issues with ll_splice_alias.

2017-07-18 Thread Oleg Drokin
Unfortunately this patch causes insta-crash on first stat call after mount.
Sorry, I cannot dig into this deeper right this moment, but I will a bit later.
I am adding Al that we discussed this code at some length and he found no 
problems
here, so I am a bit surprised by your findings.
Also the reason we reinvent the d_splice_alias is because we need to
splice not just directories, but also regular files.

I also am less sure by your previous DCACHE_DISCONECTED patch that we in fact 
might
still need, I just need to dig up a test case for that.

Thanks for looking into it!

[  170.000858] Lustre: Mounted lustre-client
[  172.799813] Lustre: DEBUG MARKER: Using TIMEOUT=20
[  186.627954] BUG: unable to handle kernel NULL pointer dereference at 
00a8
[  186.628742] IP: __lock_acquire+0x125/0x1370
[  186.629137] PGD 0 
[  186.629138] P4D 0 
[  186.629496] 
[  186.630216] Oops:  [#1] SMP DEBUG_PAGEALLOC
[  186.630613] Modules linked in: osc(C) mgc(C) lustre(C) lmv(C) fld(C) mdc(C) 
fid(C) lov(C) ksocklnd(C) ptlrpc(C) obdclass(C) lnet(C) sha512_ssse3 
sha512_generic crc32_generic libcfs(C) joydev pcspkr i2c_piix4 virtio_console 
rpcsec_gss_krb5 ttm drm_kms_helper syscopyarea sysfillrect sysimgblt 
fb_sys_fops drm serio_raw virtio_blk floppy
[  186.633238] CPU: 2 PID: 10897 Comm: sh Tainted: G C  
4.13.0-rc1-vm-nfs+ #154
[  186.633990] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  186.634402] task: 8800d6a1c180 task.stack: c900066a
[  186.634807] RIP: 0010:__lock_acquire+0x125/0x1370
[  186.635206] RSP: 0018:c900066a3750 EFLAGS: 00010002
[  186.635597] RAX: 0046 RBX: 0001 RCX: 
[  186.636013] RDX: 0001 RSI:  RDI: 
[  186.636432] RBP: c900066a3810 R08: a05221a9 R09: 
[  186.636844] R10:  R11: 8800d6a1c180 R12: 0001
[  186.637264] R13:  R14: 0001 R15: 00a8
[  186.637679] FS:  7fd679eaa700() GS:88011a20() 
knlGS:
[  186.638402] CS:  0010 DS:  ES:  CR0: 80050033
[  186.638803] CR2: 00a8 CR3: 000109c0b000 CR4: 06e0
[  186.639228] Call Trace:
[  186.639589]  lock_acquire+0xe3/0x1d0
[  186.639962]  ? lock_acquire+0xe3/0x1d0
[  186.640352]  ? ll_lookup_it_finish+0x379/0xca0 [lustre]
[  186.640747]  _raw_spin_lock+0x34/0x70
[  186.641130]  ? ll_lookup_it_finish+0x379/0xca0 [lustre]
[  186.641529]  ll_lookup_it_finish+0x379/0xca0 [lustre]
[  186.641943]  ? req_capsule_server_get+0x15/0x20 [ptlrpc]
[  186.642368]  ? lmv_revalidate_slaves+0x790/0x790 [lmv]
[  186.642779]  ll_lookup_it+0x26d/0x820 [lustre]
[  186.643175]  ll_lookup_nd+0x162/0x1a0 [lustre]
[  186.643575]  lookup_slow+0x132/0x220
[  186.643947]  ? __wake_up+0x23/0x50
[  186.644322]  walk_component+0x1bf/0x350
[  186.644714]  link_path_walk+0x1b8/0x630
[  186.645097]  path_lookupat+0x99/0x220
[  186.645459]  ? __kernel_map_pages+0x131/0x140
[  186.645830]  ? __kernel_map_pages+0x131/0x140
[  186.646206]  filename_lookup+0xb8/0x1a0
[  186.646575]  ? __check_object_size+0xb1/0x1a0
[  186.646952]  ? strncpy_from_user+0x4d/0x160
[  186.647326]  user_path_at_empty+0x36/0x40
[  186.647694]  ? user_path_at_empty+0x36/0x40
[  186.648068]  vfs_statx+0x76/0xe0
[  186.648429]  SYSC_newstat+0x3d/0x70
[  186.648789]  ? trace_hardirqs_on_caller+0xf4/0x190
[  186.649171]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  186.649547]  SyS_newstat+0xe/0x10
[  186.649906]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[  186.650289] RIP: 0033:0x7fd679599475
[  186.650651] RSP: 002b:7ffc2e568fc8 EFLAGS: 0246 ORIG_RAX: 
0004
[  186.651350] RAX: ffda RBX: 7fd679862ae0 RCX: 7fd679599475
[  186.651758] RDX: 7ffc2e568fe0 RSI: 7ffc2e568fe0 RDI: 00eec2877730
[  186.652171] RBP: 7fd679862ae0 R08: 00eec2cafcb0 R09: 0008
[  186.652579] R10: 00eec2cafcb0 R11: 0246 R12: 0020
[  186.652982] R13: 00eec2cc7e10 R14: 00eec2cb2190 R15: 7ffc2e5690f8
[  186.653393] Code: c8 65 48 33 3c 25 28 00 00 00 44 89 e0 0f 85 91 0d 00 00 
48 81 c4 90 00 00 00 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 <49> 81 
3f 00 90 6f 82 41 bc 00 00 00 00 44 0f 45 e2 83 fe 01 0f 
[  186.654536] RIP: __lock_acquire+0x125/0x1370 RSP: c900066a3750
[  186.654933] CR2: 00a8


On Jul 18, 2017, at 7:26 PM, NeilBrown wrote:

> 1/ The testing of DCACHE_DISCONNECTED is wrong.
>  see upstream commit da093a9b76ef ("dcache: d_splice_alias should
>  ignore DCACHE_DISCONNECTED")
> 
>  As this is a notoriously difficult piece of code to get right,
>  it makes sense to use d_splice_alias() directly and no try to
>  create a local version of it.
> 
> 2/ ll_find_alias() currently:
> locks and alias
> checks that it is the one we want
> unlock it
> locks it again
> gets a reference
> unlocks it
> 

Re: [PATCH 03/12] staging: lustre: llite: fix various issues with ll_splice_alias.

2017-07-18 Thread Oleg Drokin
Unfortunately this patch causes insta-crash on first stat call after mount.
Sorry, I cannot dig into this deeper right this moment, but I will a bit later.
I am adding Al that we discussed this code at some length and he found no 
problems
here, so I am a bit surprised by your findings.
Also the reason we reinvent the d_splice_alias is because we need to
splice not just directories, but also regular files.

I also am less sure by your previous DCACHE_DISCONECTED patch that we in fact 
might
still need, I just need to dig up a test case for that.

Thanks for looking into it!

[  170.000858] Lustre: Mounted lustre-client
[  172.799813] Lustre: DEBUG MARKER: Using TIMEOUT=20
[  186.627954] BUG: unable to handle kernel NULL pointer dereference at 
00a8
[  186.628742] IP: __lock_acquire+0x125/0x1370
[  186.629137] PGD 0 
[  186.629138] P4D 0 
[  186.629496] 
[  186.630216] Oops:  [#1] SMP DEBUG_PAGEALLOC
[  186.630613] Modules linked in: osc(C) mgc(C) lustre(C) lmv(C) fld(C) mdc(C) 
fid(C) lov(C) ksocklnd(C) ptlrpc(C) obdclass(C) lnet(C) sha512_ssse3 
sha512_generic crc32_generic libcfs(C) joydev pcspkr i2c_piix4 virtio_console 
rpcsec_gss_krb5 ttm drm_kms_helper syscopyarea sysfillrect sysimgblt 
fb_sys_fops drm serio_raw virtio_blk floppy
[  186.633238] CPU: 2 PID: 10897 Comm: sh Tainted: G C  
4.13.0-rc1-vm-nfs+ #154
[  186.633990] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  186.634402] task: 8800d6a1c180 task.stack: c900066a
[  186.634807] RIP: 0010:__lock_acquire+0x125/0x1370
[  186.635206] RSP: 0018:c900066a3750 EFLAGS: 00010002
[  186.635597] RAX: 0046 RBX: 0001 RCX: 
[  186.636013] RDX: 0001 RSI:  RDI: 
[  186.636432] RBP: c900066a3810 R08: a05221a9 R09: 
[  186.636844] R10:  R11: 8800d6a1c180 R12: 0001
[  186.637264] R13:  R14: 0001 R15: 00a8
[  186.637679] FS:  7fd679eaa700() GS:88011a20() 
knlGS:
[  186.638402] CS:  0010 DS:  ES:  CR0: 80050033
[  186.638803] CR2: 00a8 CR3: 000109c0b000 CR4: 06e0
[  186.639228] Call Trace:
[  186.639589]  lock_acquire+0xe3/0x1d0
[  186.639962]  ? lock_acquire+0xe3/0x1d0
[  186.640352]  ? ll_lookup_it_finish+0x379/0xca0 [lustre]
[  186.640747]  _raw_spin_lock+0x34/0x70
[  186.641130]  ? ll_lookup_it_finish+0x379/0xca0 [lustre]
[  186.641529]  ll_lookup_it_finish+0x379/0xca0 [lustre]
[  186.641943]  ? req_capsule_server_get+0x15/0x20 [ptlrpc]
[  186.642368]  ? lmv_revalidate_slaves+0x790/0x790 [lmv]
[  186.642779]  ll_lookup_it+0x26d/0x820 [lustre]
[  186.643175]  ll_lookup_nd+0x162/0x1a0 [lustre]
[  186.643575]  lookup_slow+0x132/0x220
[  186.643947]  ? __wake_up+0x23/0x50
[  186.644322]  walk_component+0x1bf/0x350
[  186.644714]  link_path_walk+0x1b8/0x630
[  186.645097]  path_lookupat+0x99/0x220
[  186.645459]  ? __kernel_map_pages+0x131/0x140
[  186.645830]  ? __kernel_map_pages+0x131/0x140
[  186.646206]  filename_lookup+0xb8/0x1a0
[  186.646575]  ? __check_object_size+0xb1/0x1a0
[  186.646952]  ? strncpy_from_user+0x4d/0x160
[  186.647326]  user_path_at_empty+0x36/0x40
[  186.647694]  ? user_path_at_empty+0x36/0x40
[  186.648068]  vfs_statx+0x76/0xe0
[  186.648429]  SYSC_newstat+0x3d/0x70
[  186.648789]  ? trace_hardirqs_on_caller+0xf4/0x190
[  186.649171]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  186.649547]  SyS_newstat+0xe/0x10
[  186.649906]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[  186.650289] RIP: 0033:0x7fd679599475
[  186.650651] RSP: 002b:7ffc2e568fc8 EFLAGS: 0246 ORIG_RAX: 
0004
[  186.651350] RAX: ffda RBX: 7fd679862ae0 RCX: 7fd679599475
[  186.651758] RDX: 7ffc2e568fe0 RSI: 7ffc2e568fe0 RDI: 00eec2877730
[  186.652171] RBP: 7fd679862ae0 R08: 00eec2cafcb0 R09: 0008
[  186.652579] R10: 00eec2cafcb0 R11: 0246 R12: 0020
[  186.652982] R13: 00eec2cc7e10 R14: 00eec2cb2190 R15: 7ffc2e5690f8
[  186.653393] Code: c8 65 48 33 3c 25 28 00 00 00 44 89 e0 0f 85 91 0d 00 00 
48 81 c4 90 00 00 00 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 <49> 81 
3f 00 90 6f 82 41 bc 00 00 00 00 44 0f 45 e2 83 fe 01 0f 
[  186.654536] RIP: __lock_acquire+0x125/0x1370 RSP: c900066a3750
[  186.654933] CR2: 00a8


On Jul 18, 2017, at 7:26 PM, NeilBrown wrote:

> 1/ The testing of DCACHE_DISCONNECTED is wrong.
>  see upstream commit da093a9b76ef ("dcache: d_splice_alias should
>  ignore DCACHE_DISCONNECTED")
> 
>  As this is a notoriously difficult piece of code to get right,
>  it makes sense to use d_splice_alias() directly and no try to
>  create a local version of it.
> 
> 2/ ll_find_alias() currently:
> locks and alias
> checks that it is the one we want
> unlock it
> locks it again
> gets a reference
> unlocks it
> 

Re: [PATCH][V2] staging: lustre: fix spelling mistake, "grranted" -> "granted"

2017-07-14 Thread Oleg Drokin

On Jul 14, 2017, at 5:33 PM, Colin King wrote:

> From: Colin Ian King <colin.k...@canonical.com>
> 
> Trivial fix to spelling mistake in CERROR error message. Also
> clean up the grammar.
> 
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>

Reviewed-by: Oleg Drokin <oleg.dro...@intel.com>

> ---
> drivers/staging/lustre/lustre/ptlrpc/import.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c 
> b/drivers/staging/lustre/lustre/ptlrpc/import.c
> index 52cb1f0c9c94..b19dac15e901 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/import.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
> @@ -1026,7 +1026,7 @@ static int ptlrpc_connect_interpret(const struct lu_env 
> *env,
>   /* check that server granted subset of flags we asked for. */
>   if ((ocd->ocd_connect_flags & imp->imp_connect_flags_orig) !=
>   ocd->ocd_connect_flags) {
> - CERROR("%s: Server didn't granted asked subset of flags: 
> asked=%#llx grranted=%#llx\n",
> + CERROR("%s: Server didn't grant the asked for subset of flags: 
> asked=%#llx granted=%#llx\n",
>  imp->imp_obd->obd_name, imp->imp_connect_flags_orig,
>  ocd->ocd_connect_flags);
>   rc = -EPROTO;
> -- 
> 2.11.0



Re: [PATCH][V2] staging: lustre: fix spelling mistake, "grranted" -> "granted"

2017-07-14 Thread Oleg Drokin

On Jul 14, 2017, at 5:33 PM, Colin King wrote:

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in CERROR error message. Also
> clean up the grammar.
> 
> Signed-off-by: Colin Ian King 

Reviewed-by: Oleg Drokin 

> ---
> drivers/staging/lustre/lustre/ptlrpc/import.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c 
> b/drivers/staging/lustre/lustre/ptlrpc/import.c
> index 52cb1f0c9c94..b19dac15e901 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/import.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
> @@ -1026,7 +1026,7 @@ static int ptlrpc_connect_interpret(const struct lu_env 
> *env,
>   /* check that server granted subset of flags we asked for. */
>   if ((ocd->ocd_connect_flags & imp->imp_connect_flags_orig) !=
>   ocd->ocd_connect_flags) {
> - CERROR("%s: Server didn't granted asked subset of flags: 
> asked=%#llx grranted=%#llx\n",
> + CERROR("%s: Server didn't grant the asked for subset of flags: 
> asked=%#llx granted=%#llx\n",
>  imp->imp_obd->obd_name, imp->imp_connect_flags_orig,
>  ocd->ocd_connect_flags);
>   rc = -EPROTO;
> -- 
> 2.11.0



Re: [PATCH] staging: lustre: fix spelling mistake, "grranted" -> "granted"

2017-07-14 Thread Oleg Drokin

On Jul 14, 2017, at 9:26 AM, Colin King wrote:

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in CERROR error message
> 
> Signed-off-by: Colin Ian King 
> ---
> drivers/staging/lustre/lustre/ptlrpc/import.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c 
> b/drivers/staging/lustre/lustre/ptlrpc/import.c
> index 52cb1f0c9c94..99877aa10d6f 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/import.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
> @@ -1026,7 +1026,7 @@ static int ptlrpc_connect_interpret(const struct lu_env 
> *env,
>   /* check that server granted subset of flags we asked for. */
>   if ((ocd->ocd_connect_flags & imp->imp_connect_flags_orig) !=
>   ocd->ocd_connect_flags) {
> - CERROR("%s: Server didn't granted asked subset of flags: 
> asked=%#llx grranted=%#llx\n",
> + CERROR("%s: Server didn't granted asked subset of flags: 
> asked=%#llx granted=%#llx\n",

While we are at it can also address grammar problem: "didn't granted" -> 
"didn't grant"?

Thanks!

>  imp->imp_obd->obd_name, imp->imp_connect_flags_orig,
>  ocd->ocd_connect_flags);
>   rc = -EPROTO;
> -- 
> 2.11.0



Re: [PATCH] staging: lustre: fix spelling mistake, "grranted" -> "granted"

2017-07-14 Thread Oleg Drokin

On Jul 14, 2017, at 9:26 AM, Colin King wrote:

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in CERROR error message
> 
> Signed-off-by: Colin Ian King 
> ---
> drivers/staging/lustre/lustre/ptlrpc/import.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c 
> b/drivers/staging/lustre/lustre/ptlrpc/import.c
> index 52cb1f0c9c94..99877aa10d6f 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/import.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
> @@ -1026,7 +1026,7 @@ static int ptlrpc_connect_interpret(const struct lu_env 
> *env,
>   /* check that server granted subset of flags we asked for. */
>   if ((ocd->ocd_connect_flags & imp->imp_connect_flags_orig) !=
>   ocd->ocd_connect_flags) {
> - CERROR("%s: Server didn't granted asked subset of flags: 
> asked=%#llx grranted=%#llx\n",
> + CERROR("%s: Server didn't granted asked subset of flags: 
> asked=%#llx granted=%#llx\n",

While we are at it can also address grammar problem: "didn't granted" -> 
"didn't grant"?

Thanks!

>  imp->imp_obd->obd_name, imp->imp_connect_flags_orig,
>  ocd->ocd_connect_flags);
>   rc = -EPROTO;
> -- 
> 2.11.0



Re: [PATCH] stating: lustre: fix sparse error: incompatible types in comparison expression

2017-07-12 Thread Oleg Drokin

On Jul 12, 2017, at 10:10 PM, Rui Teng wrote:

> Comparing two user space addresses to avoid sparse error:
> 
> drivers/staging//lustre/lnet/selftest/conrpc.c:490:30: error:
> incompatible types in comparison expression (different address spaces)
> 
> Signed-off-by: Rui Teng <rui.t...@linux.vnet.ibm.com>
> ---
> drivers/staging/lustre/lnet/selftest/conrpc.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c 
> b/drivers/staging/lustre/lnet/selftest/conrpc.c
> index da36c55b86d3..ae7c2772825e 100644
> --- a/drivers/staging/lustre/lnet/selftest/conrpc.c
> +++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
> @@ -487,10 +487,9 @@ lstcon_rpc_trans_interpreter(struct lstcon_rpc_trans 
> *trans,
>  sizeof(struct list_head)))
>   return -EFAULT;
> 
> - if (tmp.next == head_up)
> - return 0;
> -
>   next = tmp.next;

So the assignment is fine, but comparison is not? Strange.

I guess this is fine by me if that makes the warning go away.

Acked-by: Oleg Drokin <oleg.dro...@intel.com>

> + if (next == head_up)
> + return 0;
> 
>   ent = list_entry(next, struct lstcon_rpc_ent, rpe_link);
> 
> -- 
> 2.11.0



Re: [PATCH] stating: lustre: fix sparse error: incompatible types in comparison expression

2017-07-12 Thread Oleg Drokin

On Jul 12, 2017, at 10:10 PM, Rui Teng wrote:

> Comparing two user space addresses to avoid sparse error:
> 
> drivers/staging//lustre/lnet/selftest/conrpc.c:490:30: error:
> incompatible types in comparison expression (different address spaces)
> 
> Signed-off-by: Rui Teng 
> ---
> drivers/staging/lustre/lnet/selftest/conrpc.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c 
> b/drivers/staging/lustre/lnet/selftest/conrpc.c
> index da36c55b86d3..ae7c2772825e 100644
> --- a/drivers/staging/lustre/lnet/selftest/conrpc.c
> +++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
> @@ -487,10 +487,9 @@ lstcon_rpc_trans_interpreter(struct lstcon_rpc_trans 
> *trans,
>  sizeof(struct list_head)))
>   return -EFAULT;
> 
> - if (tmp.next == head_up)
> - return 0;
> -
>   next = tmp.next;

So the assignment is fine, but comparison is not? Strange.

I guess this is fine by me if that makes the warning go away.

Acked-by: Oleg Drokin 

> + if (next == head_up)
> + return 0;
> 
>   ent = list_entry(next, struct lstcon_rpc_ent, rpe_link);
> 
> -- 
> 2.11.0



Re: [PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_exit

2017-05-31 Thread Oleg Drokin
Hello!

On May 31, 2017, at 4:00 AM, Jia-Ju Bai wrote:

> The driver may sleep under a spin lock, and the function call path is:
> cfs_wi_exit (acquire the lock by spin_lock)
>  LASSERT
>lbug_with_loc
>  libcfs_debug_dumplog
>schedule and kthread_run --> may sleep
> 
> To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
> drivers/staging/lustre/lnet/libcfs/workitem.c |   13 +++--
> 1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
> b/drivers/staging/lustre/lnet/libcfs/workitem.c
> index dbc2a9b..928d06d 100644
> --- a/drivers/staging/lustre/lnet/libcfs/workitem.c
> +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
> @@ -111,22 +111,23 @@ struct cfs_wi_sched {
> {
>   LASSERT(!in_interrupt()); /* because we use plain spinlock */
>   LASSERT(!sched->ws_stopping);
> + LASSERT(wi->wi_running);
> + if (wi->wi_scheduled) {
> + LASSERT(!list_empty(>wi_list));
> + LASSERT(sched->ws_nscheduled > 0);
> + }

Similarly here and in all other patches about LASSERT calls under spinlocks() 
from you,
just think of them as a panic() call, no operations are expected to continue
after it triggers.

Thanks.

> 
>   spin_lock(>ws_lock);
> 
> - LASSERT(wi->wi_running);
>   if (wi->wi_scheduled) { /* cancel pending schedules */
> - LASSERT(!list_empty(>wi_list));
>   list_del_init(>wi_list);
> -
> - LASSERT(sched->ws_nscheduled > 0);
>   sched->ws_nscheduled--;
>   }
> 
> - LASSERT(list_empty(>wi_list));
> -
>   wi->wi_scheduled = 1; /* LBUG future schedule attempts */
>   spin_unlock(>ws_lock);
> +
> + LASSERT(list_empty(>wi_list));
> }
> EXPORT_SYMBOL(cfs_wi_exit);
> 
> -- 
> 1.7.9.5
> 



Re: [PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_exit

2017-05-31 Thread Oleg Drokin
Hello!

On May 31, 2017, at 4:00 AM, Jia-Ju Bai wrote:

> The driver may sleep under a spin lock, and the function call path is:
> cfs_wi_exit (acquire the lock by spin_lock)
>  LASSERT
>lbug_with_loc
>  libcfs_debug_dumplog
>schedule and kthread_run --> may sleep
> 
> To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
> drivers/staging/lustre/lnet/libcfs/workitem.c |   13 +++--
> 1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
> b/drivers/staging/lustre/lnet/libcfs/workitem.c
> index dbc2a9b..928d06d 100644
> --- a/drivers/staging/lustre/lnet/libcfs/workitem.c
> +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
> @@ -111,22 +111,23 @@ struct cfs_wi_sched {
> {
>   LASSERT(!in_interrupt()); /* because we use plain spinlock */
>   LASSERT(!sched->ws_stopping);
> + LASSERT(wi->wi_running);
> + if (wi->wi_scheduled) {
> + LASSERT(!list_empty(>wi_list));
> + LASSERT(sched->ws_nscheduled > 0);
> + }

Similarly here and in all other patches about LASSERT calls under spinlocks() 
from you,
just think of them as a panic() call, no operations are expected to continue
after it triggers.

Thanks.

> 
>   spin_lock(>ws_lock);
> 
> - LASSERT(wi->wi_running);
>   if (wi->wi_scheduled) { /* cancel pending schedules */
> - LASSERT(!list_empty(>wi_list));
>   list_del_init(>wi_list);
> -
> - LASSERT(sched->ws_nscheduled > 0);
>   sched->ws_nscheduled--;
>   }
> 
> - LASSERT(list_empty(>wi_list));
> -
>   wi->wi_scheduled = 1; /* LBUG future schedule attempts */
>   spin_unlock(>ws_lock);
> +
> + LASSERT(list_empty(>wi_list));
> }
> EXPORT_SYMBOL(cfs_wi_exit);
> 
> -- 
> 1.7.9.5
> 



Re: [PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_deschedule

2017-05-31 Thread Oleg Drokin
Hello!

On May 31, 2017, at 3:57 AM, Jia-Ju Bai wrote:

> The driver may sleep under a spin lock, and the function call path is:
> cfs_wi_deschedule (acquire the lock by spin_lock)
>  LASSERT
>lbug_with_loc
>  libcfs_debug_dumplog
>schedule and kthread_run --> may sleep
> 
> To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
> drivers/staging/lustre/lnet/libcfs/workitem.c |   12 ++--
> 1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
> b/drivers/staging/lustre/lnet/libcfs/workitem.c
> index dbc2a9b..9c530cf 100644
> --- a/drivers/staging/lustre/lnet/libcfs/workitem.c
> +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
> @@ -140,6 +140,10 @@ struct cfs_wi_sched {
> 
>   LASSERT(!in_interrupt()); /* because we use plain spinlock */
>   LASSERT(!sched->ws_stopping);
> + if (wi->wi_scheduled) {
> + LASSERT(!list_empty(>wi_list));
> + LASSERT(sched->ws_nscheduled > 0);
> + }

I don't think you can do this,
this was under spinlock because those values could change from a different 
thread
and we do need to look at them all together.

You are correct that LASSET/LBUG might schedule to dump a debug log
into a file and even if not it does sleep indefinitely after that.
But in reality the default option is "panic_on_lbug=1" which simply converts
LASSERT() into panic().

This is certainly not a normal condition and as such I think we can leave the 
code
as is.

Thanks.


> 
>   /*
>* return 0 if it's running already, otherwise return 1, which
> @@ -151,18 +155,14 @@ struct cfs_wi_sched {
>   rc = !(wi->wi_running);
> 
>   if (wi->wi_scheduled) { /* cancel pending schedules */
> - LASSERT(!list_empty(>wi_list));
>   list_del_init(>wi_list);
> -
> - LASSERT(sched->ws_nscheduled > 0);
>   sched->ws_nscheduled--;
> -
>   wi->wi_scheduled = 0;
>   }
> 
> - LASSERT(list_empty(>wi_list));
> -
>   spin_unlock(>ws_lock);
> +
> + LASSERT(list_empty(>wi_list));
>   return rc;
> }
> EXPORT_SYMBOL(cfs_wi_deschedule);
> -- 
> 1.7.9.5
> 



Re: [PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_deschedule

2017-05-31 Thread Oleg Drokin
Hello!

On May 31, 2017, at 3:57 AM, Jia-Ju Bai wrote:

> The driver may sleep under a spin lock, and the function call path is:
> cfs_wi_deschedule (acquire the lock by spin_lock)
>  LASSERT
>lbug_with_loc
>  libcfs_debug_dumplog
>schedule and kthread_run --> may sleep
> 
> To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
> drivers/staging/lustre/lnet/libcfs/workitem.c |   12 ++--
> 1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
> b/drivers/staging/lustre/lnet/libcfs/workitem.c
> index dbc2a9b..9c530cf 100644
> --- a/drivers/staging/lustre/lnet/libcfs/workitem.c
> +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
> @@ -140,6 +140,10 @@ struct cfs_wi_sched {
> 
>   LASSERT(!in_interrupt()); /* because we use plain spinlock */
>   LASSERT(!sched->ws_stopping);
> + if (wi->wi_scheduled) {
> + LASSERT(!list_empty(>wi_list));
> + LASSERT(sched->ws_nscheduled > 0);
> + }

I don't think you can do this,
this was under spinlock because those values could change from a different 
thread
and we do need to look at them all together.

You are correct that LASSET/LBUG might schedule to dump a debug log
into a file and even if not it does sleep indefinitely after that.
But in reality the default option is "panic_on_lbug=1" which simply converts
LASSERT() into panic().

This is certainly not a normal condition and as such I think we can leave the 
code
as is.

Thanks.


> 
>   /*
>* return 0 if it's running already, otherwise return 1, which
> @@ -151,18 +155,14 @@ struct cfs_wi_sched {
>   rc = !(wi->wi_running);
> 
>   if (wi->wi_scheduled) { /* cancel pending schedules */
> - LASSERT(!list_empty(>wi_list));
>   list_del_init(>wi_list);
> -
> - LASSERT(sched->ws_nscheduled > 0);
>   sched->ws_nscheduled--;
> -
>   wi->wi_scheduled = 0;
>   }
> 
> - LASSERT(list_empty(>wi_list));
> -
>   spin_unlock(>ws_lock);
> +
> + LASSERT(list_empty(>wi_list));
>   return rc;
> }
> EXPORT_SYMBOL(cfs_wi_deschedule);
> -- 
> 1.7.9.5
> 



Re: [PATCH] staging/lustre/lov: remove set_fs() call from lov_getstripe()

2017-05-29 Thread Oleg Drokin

On May 29, 2017, at 10:28 AM, Greg Kroah-Hartman wrote:

> On Fri, May 26, 2017 at 11:40:33PM -0400, Oleg Drokin wrote:
>> lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
>> lov_user_md pointer from user- or kernel-space.  This changes the
>> behavior of copy_from_user() on SPARC and may result in a misaligned
>> access exception which in turn oopses the kernel.  In fact the
>> relevant argument to lov_getstripe() is never called with a
>> kernel-space pointer and so changing the address limits is unnecessary
>> and so we remove the calls to save, set, and restore the address
>> limits.
>> 
>> Signed-off-by: John L. Hammond <john.hamm...@intel.com>
>> Reviewed-on: http://review.whamcloud.com/6150
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
>> Reviewed-by: Andreas Dilger <andreas.dil...@intel.com>
>> Reviewed-by: Li Wei <wei.g...@intel.com>
>> Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
>> ---
>> drivers/staging/lustre/lustre/lov/lov_pack.c | 9 -
>> 1 file changed, 9 deletions(-)
> 
> So is this the patch that you want applied to the staging tree(s) as
> well?  If so, please let me know, otherwise I have no clue…

Yes, this is it.
Thanks!



Re: [PATCH] staging/lustre/lov: remove set_fs() call from lov_getstripe()

2017-05-29 Thread Oleg Drokin

On May 29, 2017, at 10:28 AM, Greg Kroah-Hartman wrote:

> On Fri, May 26, 2017 at 11:40:33PM -0400, Oleg Drokin wrote:
>> lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
>> lov_user_md pointer from user- or kernel-space.  This changes the
>> behavior of copy_from_user() on SPARC and may result in a misaligned
>> access exception which in turn oopses the kernel.  In fact the
>> relevant argument to lov_getstripe() is never called with a
>> kernel-space pointer and so changing the address limits is unnecessary
>> and so we remove the calls to save, set, and restore the address
>> limits.
>> 
>> Signed-off-by: John L. Hammond 
>> Reviewed-on: http://review.whamcloud.com/6150
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
>> Reviewed-by: Andreas Dilger 
>> Reviewed-by: Li Wei 
>> Signed-off-by: Oleg Drokin 
>> ---
>> drivers/staging/lustre/lustre/lov/lov_pack.c | 9 -
>> 1 file changed, 9 deletions(-)
> 
> So is this the patch that you want applied to the staging tree(s) as
> well?  If so, please let me know, otherwise I have no clue…

Yes, this is it.
Thanks!



[PATCH] staging/lustre/lov: remove set_fs() call from lov_getstripe()

2017-05-26 Thread Oleg Drokin
lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
lov_user_md pointer from user- or kernel-space.  This changes the
behavior of copy_from_user() on SPARC and may result in a misaligned
access exception which in turn oopses the kernel.  In fact the
relevant argument to lov_getstripe() is never called with a
kernel-space pointer and so changing the address limits is unnecessary
and so we remove the calls to save, set, and restore the address
limits.

Signed-off-by: John L. Hammond <john.hamm...@intel.com>
Reviewed-on: http://review.whamcloud.com/6150
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
Reviewed-by: Andreas Dilger <andreas.dil...@intel.com>
Reviewed-by: Li Wei <wei.g...@intel.com>
Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/lov/lov_pack.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c 
b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 2e1bd47..e6727ce 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -293,18 +293,10 @@ int lov_getstripe(struct lov_object *obj, struct 
lov_stripe_md *lsm,
size_t lmmk_size;
size_t lum_size;
int rc;
-   mm_segment_t seg;
 
if (!lsm)
return -ENODATA;
 
-   /*
-* "Switch to kernel segment" to allow copying from kernel space by
-* copy_{to,from}_user().
-*/
-   seg = get_fs();
-   set_fs(KERNEL_DS);
-
if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
CERROR("bad LSM MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
   lsm->lsm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
@@ -406,6 +398,5 @@ int lov_getstripe(struct lov_object *obj, struct 
lov_stripe_md *lsm,
 out_free:
kvfree(lmmk);
 out:
-   set_fs(seg);
return rc;
 }
-- 
2.9.3



[PATCH] staging/lustre/lov: remove set_fs() call from lov_getstripe()

2017-05-26 Thread Oleg Drokin
lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
lov_user_md pointer from user- or kernel-space.  This changes the
behavior of copy_from_user() on SPARC and may result in a misaligned
access exception which in turn oopses the kernel.  In fact the
relevant argument to lov_getstripe() is never called with a
kernel-space pointer and so changing the address limits is unnecessary
and so we remove the calls to save, set, and restore the address
limits.

Signed-off-by: John L. Hammond 
Reviewed-on: http://review.whamcloud.com/6150
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
Reviewed-by: Andreas Dilger 
Reviewed-by: Li Wei 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/lov/lov_pack.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c 
b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 2e1bd47..e6727ce 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -293,18 +293,10 @@ int lov_getstripe(struct lov_object *obj, struct 
lov_stripe_md *lsm,
size_t lmmk_size;
size_t lum_size;
int rc;
-   mm_segment_t seg;
 
if (!lsm)
return -ENODATA;
 
-   /*
-* "Switch to kernel segment" to allow copying from kernel space by
-* copy_{to,from}_user().
-*/
-   seg = get_fs();
-   set_fs(KERNEL_DS);
-
if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
CERROR("bad LSM MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
   lsm->lsm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
@@ -406,6 +398,5 @@ int lov_getstripe(struct lov_object *obj, struct 
lov_stripe_md *lsm,
 out_free:
kvfree(lmmk);
 out:
-   set_fs(seg);
return rc;
 }
-- 
2.9.3



Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin

On Mar 19, 2017, at 12:41 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>> Ever since sysfs migration, class_process_proc_param stopped working
>> correctly as all the useful params were no longer present as lvars.
>> Replace all the nasty fake proc writes with hopefully less nasty
>> kobject attribute search and then update the attributes as needed.
>> 
>> Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
>> Reported-by: Al Viro <v...@zeniv.linux.org.uk>
>> ---
>> Al has quite rightfully complained in the past that class_process_proc_param
>> is a terrible piece of code and needs to go.
>> This patch is an attempt at improving it somewhat and in process drop
>> all the user/kernel address space games we needed to play to make it work
>> in the past (and which I suspect attracted Al's attention in the first 
>> place).
>> 
>> Now I wonder if iterating kobject attributes like that would be ok with
>> you Greg, or do you think there is a better way?
>> class_find_write_attr could be turned into something generic since it's
>> certainly convenient to reuse same table of name-write_method pairs,
>> but I did some cursory research and nobody else seems to need anything
>> of the sort in-tree.
>> 
>> I know ll_process_config is still awful and I will likely just
>> replace the current hack with kset_find_obj, but I just wanted to make
>> sure this new approach would be ok before spending too much time on it.
>> 
>> Thanks!
>> 
>> drivers/staging/lustre/lustre/include/obd_class.h  |  4 +-
>> drivers/staging/lustre/lustre/llite/llite_lib.c| 10 +--
>> drivers/staging/lustre/lustre/lov/lov_obd.c|  3 +-
>> drivers/staging/lustre/lustre/mdc/mdc_request.c|  3 +-
>> .../staging/lustre/lustre/obdclass/obd_config.c| 78 
>> ++
>> drivers/staging/lustre/lustre/osc/osc_request.c|  3 +-
>> 6 files changed, 44 insertions(+), 57 deletions(-)
>> 
>> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h 
>> b/drivers/staging/lustre/lustre/include/obd_class.h
>> index 083a6ff..badafb8 100644
>> --- a/drivers/staging/lustre/lustre/include/obd_class.h
>> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
>> @@ -114,8 +114,8 @@ typedef int (*llog_cb_t)(const struct lu_env *, struct 
>> llog_handle *,
>>   struct llog_rec_hdr *, void *);
>> /* obd_config.c */
>> int class_process_config(struct lustre_cfg *lcfg);
>> -int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
>> - struct lustre_cfg *lcfg, void *data);
>> +int class_process_attr_param(char *prefix, struct kobject *kobj,
>> + struct lustre_cfg *lcfg);
> 
> As you are exporting these functions, they will need to end up with a
> lustre_* prefix eventually :)

ok.

> 
>> struct obd_device *class_incref(struct obd_device *obd,
>>  const char *scope, const void *source);
>> void class_decref(struct obd_device *obd,
>> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
>> b/drivers/staging/lustre/lustre/llite/llite_lib.c
>> index 7b80040..192b877 100644
>> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
>> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
>> @@ -2259,7 +2259,7 @@ int ll_obd_statfs(struct inode *inode, void __user 
>> *arg)
>> int ll_process_config(struct lustre_cfg *lcfg)
>> {
>>  char *ptr;
>> -void *sb;
>> +struct super_block *sb;
>>  struct lprocfs_static_vars lvars;
>>  unsigned long x;
>>  int rc = 0;
>> @@ -2273,15 +2273,15 @@ int ll_process_config(struct lustre_cfg *lcfg)
>>  rc = kstrtoul(ptr, 16, );
>>  if (rc != 0)
>>  return -EINVAL;
>> -sb = (void *)x;
>> +sb = (struct super_block *)x;
>>  /* This better be a real Lustre superblock! */
>> -LASSERT(s2lsi((struct super_block *)sb)->lsi_lmd->lmd_magic == 
>> LMD_MAGIC);
>> +LASSERT(s2lsi(sb)->lsi_lmd->lmd_magic == LMD_MAGIC);
>> 
>>  /* Note we have not called client_common_fill_super yet, so
>>   * proc fns must be able to handle that!
>>   */
>> -rc = class_process_proc_param(PARAM_LLITE, lvars.obd_vars,
>> -  lcfg, sb);
>> +rc = class_process_attr_param(PARAM_LLITE, _s2sbi(sb)->ll_kobj,
>> +  lcfg);
>>  if (rc > 0)
>> 

Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin

On Mar 19, 2017, at 12:41 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>> Ever since sysfs migration, class_process_proc_param stopped working
>> correctly as all the useful params were no longer present as lvars.
>> Replace all the nasty fake proc writes with hopefully less nasty
>> kobject attribute search and then update the attributes as needed.
>> 
>> Signed-off-by: Oleg Drokin 
>> Reported-by: Al Viro 
>> ---
>> Al has quite rightfully complained in the past that class_process_proc_param
>> is a terrible piece of code and needs to go.
>> This patch is an attempt at improving it somewhat and in process drop
>> all the user/kernel address space games we needed to play to make it work
>> in the past (and which I suspect attracted Al's attention in the first 
>> place).
>> 
>> Now I wonder if iterating kobject attributes like that would be ok with
>> you Greg, or do you think there is a better way?
>> class_find_write_attr could be turned into something generic since it's
>> certainly convenient to reuse same table of name-write_method pairs,
>> but I did some cursory research and nobody else seems to need anything
>> of the sort in-tree.
>> 
>> I know ll_process_config is still awful and I will likely just
>> replace the current hack with kset_find_obj, but I just wanted to make
>> sure this new approach would be ok before spending too much time on it.
>> 
>> Thanks!
>> 
>> drivers/staging/lustre/lustre/include/obd_class.h  |  4 +-
>> drivers/staging/lustre/lustre/llite/llite_lib.c| 10 +--
>> drivers/staging/lustre/lustre/lov/lov_obd.c|  3 +-
>> drivers/staging/lustre/lustre/mdc/mdc_request.c|  3 +-
>> .../staging/lustre/lustre/obdclass/obd_config.c| 78 
>> ++
>> drivers/staging/lustre/lustre/osc/osc_request.c|  3 +-
>> 6 files changed, 44 insertions(+), 57 deletions(-)
>> 
>> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h 
>> b/drivers/staging/lustre/lustre/include/obd_class.h
>> index 083a6ff..badafb8 100644
>> --- a/drivers/staging/lustre/lustre/include/obd_class.h
>> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
>> @@ -114,8 +114,8 @@ typedef int (*llog_cb_t)(const struct lu_env *, struct 
>> llog_handle *,
>>   struct llog_rec_hdr *, void *);
>> /* obd_config.c */
>> int class_process_config(struct lustre_cfg *lcfg);
>> -int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
>> - struct lustre_cfg *lcfg, void *data);
>> +int class_process_attr_param(char *prefix, struct kobject *kobj,
>> + struct lustre_cfg *lcfg);
> 
> As you are exporting these functions, they will need to end up with a
> lustre_* prefix eventually :)

ok.

> 
>> struct obd_device *class_incref(struct obd_device *obd,
>>  const char *scope, const void *source);
>> void class_decref(struct obd_device *obd,
>> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
>> b/drivers/staging/lustre/lustre/llite/llite_lib.c
>> index 7b80040..192b877 100644
>> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
>> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
>> @@ -2259,7 +2259,7 @@ int ll_obd_statfs(struct inode *inode, void __user 
>> *arg)
>> int ll_process_config(struct lustre_cfg *lcfg)
>> {
>>  char *ptr;
>> -void *sb;
>> +struct super_block *sb;
>>  struct lprocfs_static_vars lvars;
>>  unsigned long x;
>>  int rc = 0;
>> @@ -2273,15 +2273,15 @@ int ll_process_config(struct lustre_cfg *lcfg)
>>  rc = kstrtoul(ptr, 16, );
>>  if (rc != 0)
>>  return -EINVAL;
>> -sb = (void *)x;
>> +sb = (struct super_block *)x;
>>  /* This better be a real Lustre superblock! */
>> -LASSERT(s2lsi((struct super_block *)sb)->lsi_lmd->lmd_magic == 
>> LMD_MAGIC);
>> +LASSERT(s2lsi(sb)->lsi_lmd->lmd_magic == LMD_MAGIC);
>> 
>>  /* Note we have not called client_common_fill_super yet, so
>>   * proc fns must be able to handle that!
>>   */
>> -rc = class_process_proc_param(PARAM_LLITE, lvars.obd_vars,
>> -  lcfg, sb);
>> +rc = class_process_attr_param(PARAM_LLITE, _s2sbi(sb)->ll_kobj,
>> +  lcfg);
>>  if (rc > 0)
>>  rc = 0;
>>  return rc;
>> diff --git a/drivers/st

Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin

On Mar 19, 2017, at 12:29 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 11:17:55AM -0400, Oleg Drokin wrote:
>> 
>> On Mar 18, 2017, at 6:34 AM, Greg Kroah-Hartman wrote:
>> 
>>> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>>>> Ever since sysfs migration, class_process_proc_param stopped working
>>>> correctly as all the useful params were no longer present as lvars.
>>>> Replace all the nasty fake proc writes with hopefully less nasty
>>>> kobject attribute search and then update the attributes as needed.
>>>> 
>>>> Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
>>>> Reported-by: Al Viro <v...@zeniv.linux.org.uk>
>>>> ---
>>>> Al has quite rightfully complained in the past that 
>>>> class_process_proc_param
>>>> is a terrible piece of code and needs to go.
>>>> This patch is an attempt at improving it somewhat and in process drop
>>>> all the user/kernel address space games we needed to play to make it work
>>>> in the past (and which I suspect attracted Al's attention in the first 
>>>> place).
>>>> 
>>>> Now I wonder if iterating kobject attributes like that would be ok with
>>>> you Greg, or do you think there is a better way?
>>>> class_find_write_attr could be turned into something generic since it's
>>>> certainly convenient to reuse same table of name-write_method pairs,
>>>> but I did some cursory research and nobody else seems to need anything
>>>> of the sort in-tree.
>>>> 
>>>> I know ll_process_config is still awful and I will likely just
>>>> replace the current hack with kset_find_obj, but I just wanted to make
>>>> sure this new approach would be ok before spending too much time on it.
>>> 
>>> I'm not quite sure what exactly you are even trying to do here.  What is
>>> this interface?  Who calls it, and how?  What does it want to do?
>> 
>> This is a configuration client code.
>> Management server has ability to pass down config information in the form of:
>> fsname.subsystem.attribute=value to clients and other servers
>> (subsystem determines if it's something of interest of clients or servers or
>> both).
>> This could be changed in the real time - i.e. you update it on the server and
>> that gets propagated to all the clients/servers, so no need to ssh into
>> every node to manually apply those changes and worry about client restarts
>> (the config is remembered at the management server and would be applied to 
>> any
>> new nodes connecting/across server restarts and such).
>> 
>> So the way it then works then is once the string 
>> fsname.subsystem.attribute=value is delivered to the client, we find all 
>> instances of filesystem with fsname and then
>> all subsystems within it (one kobject per subsystem instance) and then 
>> update the
>> attributes to the value supplied.
>> 
>> The same filesystem might be mounted more than once and then some layers 
>> might have
>> multiple instances inside a single filesystems.
>> 
>> In the end it would turn something like lustre.osc.max_dirty_mb=128 into
>> writes to
>> /sys/fs/lustre/osc/lustre-OST-osc-8800d75ca000/max_dirty_mb and
>> /sys/fs/lustre/osc/lustre-OST0001-osc-8800d75ca000/max_dirty_mb
>> without actually iterating in sysfs namespace.
> 
> Wait, who is doing the "write"?  From within the kernel?  Or some
> userspace app?  I'm guessing from within the kernel, you are receiving
> the data from some other transport within the filesystem and then need
> to apply the settings?

Yes, kernel code gets the notification "hey, there's a config change, come get 
it",
then it requests the diff in the config and does the "write' by updating the
attributes.

>> The alternative we considered is we can probably just do an upcall and have
>> a userspace tool called with the parameter verbatim and try to figure it out,
>> but that seems a lot less ideal, and also we'll get a bunch of complications 
>> from
>> containers and such too, I imagine.
> 
> Yeah, no, don't do an upcall, that's a mess.
> 
>> The function pre-this patch is assuming that all these values are part of
>> a list of procfs values (no longer true after sysfs migration) so just 
>> iterates
>> that list and calls the write for matched names (but also needs to supply a 
>> userspace
>> buffer so looks much uglier too).
> 
> For kobjects, you don't need userspace b

Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin

On Mar 19, 2017, at 12:29 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 11:17:55AM -0400, Oleg Drokin wrote:
>> 
>> On Mar 18, 2017, at 6:34 AM, Greg Kroah-Hartman wrote:
>> 
>>> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>>>> Ever since sysfs migration, class_process_proc_param stopped working
>>>> correctly as all the useful params were no longer present as lvars.
>>>> Replace all the nasty fake proc writes with hopefully less nasty
>>>> kobject attribute search and then update the attributes as needed.
>>>> 
>>>> Signed-off-by: Oleg Drokin 
>>>> Reported-by: Al Viro 
>>>> ---
>>>> Al has quite rightfully complained in the past that 
>>>> class_process_proc_param
>>>> is a terrible piece of code and needs to go.
>>>> This patch is an attempt at improving it somewhat and in process drop
>>>> all the user/kernel address space games we needed to play to make it work
>>>> in the past (and which I suspect attracted Al's attention in the first 
>>>> place).
>>>> 
>>>> Now I wonder if iterating kobject attributes like that would be ok with
>>>> you Greg, or do you think there is a better way?
>>>> class_find_write_attr could be turned into something generic since it's
>>>> certainly convenient to reuse same table of name-write_method pairs,
>>>> but I did some cursory research and nobody else seems to need anything
>>>> of the sort in-tree.
>>>> 
>>>> I know ll_process_config is still awful and I will likely just
>>>> replace the current hack with kset_find_obj, but I just wanted to make
>>>> sure this new approach would be ok before spending too much time on it.
>>> 
>>> I'm not quite sure what exactly you are even trying to do here.  What is
>>> this interface?  Who calls it, and how?  What does it want to do?
>> 
>> This is a configuration client code.
>> Management server has ability to pass down config information in the form of:
>> fsname.subsystem.attribute=value to clients and other servers
>> (subsystem determines if it's something of interest of clients or servers or
>> both).
>> This could be changed in the real time - i.e. you update it on the server and
>> that gets propagated to all the clients/servers, so no need to ssh into
>> every node to manually apply those changes and worry about client restarts
>> (the config is remembered at the management server and would be applied to 
>> any
>> new nodes connecting/across server restarts and such).
>> 
>> So the way it then works then is once the string 
>> fsname.subsystem.attribute=value is delivered to the client, we find all 
>> instances of filesystem with fsname and then
>> all subsystems within it (one kobject per subsystem instance) and then 
>> update the
>> attributes to the value supplied.
>> 
>> The same filesystem might be mounted more than once and then some layers 
>> might have
>> multiple instances inside a single filesystems.
>> 
>> In the end it would turn something like lustre.osc.max_dirty_mb=128 into
>> writes to
>> /sys/fs/lustre/osc/lustre-OST-osc-8800d75ca000/max_dirty_mb and
>> /sys/fs/lustre/osc/lustre-OST0001-osc-8800d75ca000/max_dirty_mb
>> without actually iterating in sysfs namespace.
> 
> Wait, who is doing the "write"?  From within the kernel?  Or some
> userspace app?  I'm guessing from within the kernel, you are receiving
> the data from some other transport within the filesystem and then need
> to apply the settings?

Yes, kernel code gets the notification "hey, there's a config change, come get 
it",
then it requests the diff in the config and does the "write' by updating the
attributes.

>> The alternative we considered is we can probably just do an upcall and have
>> a userspace tool called with the parameter verbatim and try to figure it out,
>> but that seems a lot less ideal, and also we'll get a bunch of complications 
>> from
>> containers and such too, I imagine.
> 
> Yeah, no, don't do an upcall, that's a mess.
> 
>> The function pre-this patch is assuming that all these values are part of
>> a list of procfs values (no longer true after sysfs migration) so just 
>> iterates
>> that list and calls the write for matched names (but also needs to supply a 
>> userspace
>> buffer so looks much uglier too).
> 
> For kobjects, you don't need userspace buffers, so this should be
> easier.

Right, it's less of a mess 

Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin

On Mar 18, 2017, at 6:34 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>> Ever since sysfs migration, class_process_proc_param stopped working
>> correctly as all the useful params were no longer present as lvars.
>> Replace all the nasty fake proc writes with hopefully less nasty
>> kobject attribute search and then update the attributes as needed.
>> 
>> Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
>> Reported-by: Al Viro <v...@zeniv.linux.org.uk>
>> ---
>> Al has quite rightfully complained in the past that class_process_proc_param
>> is a terrible piece of code and needs to go.
>> This patch is an attempt at improving it somewhat and in process drop
>> all the user/kernel address space games we needed to play to make it work
>> in the past (and which I suspect attracted Al's attention in the first 
>> place).
>> 
>> Now I wonder if iterating kobject attributes like that would be ok with
>> you Greg, or do you think there is a better way?
>> class_find_write_attr could be turned into something generic since it's
>> certainly convenient to reuse same table of name-write_method pairs,
>> but I did some cursory research and nobody else seems to need anything
>> of the sort in-tree.
>> 
>> I know ll_process_config is still awful and I will likely just
>> replace the current hack with kset_find_obj, but I just wanted to make
>> sure this new approach would be ok before spending too much time on it.
> 
> I'm not quite sure what exactly you are even trying to do here.  What is
> this interface?  Who calls it, and how?  What does it want to do?

This is a configuration client code.
Management server has ability to pass down config information in the form of:
fsname.subsystem.attribute=value to clients and other servers
(subsystem determines if it's something of interest of clients or servers or
both).
This could be changed in the real time - i.e. you update it on the server and
that gets propagated to all the clients/servers, so no need to ssh into
every node to manually apply those changes and worry about client restarts
(the config is remembered at the management server and would be applied to any
new nodes connecting/across server restarts and such).

So the way it then works then is once the string 
fsname.subsystem.attribute=value is delivered to the client, we find all 
instances of filesystem with fsname and then
all subsystems within it (one kobject per subsystem instance) and then update 
the
attributes to the value supplied.

The same filesystem might be mounted more than once and then some layers might 
have
multiple instances inside a single filesystems.

In the end it would turn something like lustre.osc.max_dirty_mb=128 into
writes to
/sys/fs/lustre/osc/lustre-OST-osc-8800d75ca000/max_dirty_mb and
/sys/fs/lustre/osc/lustre-OST0001-osc-8800d75ca000/max_dirty_mb
without actually iterating in sysfs namespace.

The alternative we considered is we can probably just do an upcall and have
a userspace tool called with the parameter verbatim and try to figure it out,
but that seems a lot less ideal, and also we'll get a bunch of complications 
from
containers and such too, I imagine.

The function pre-this patch is assuming that all these values are part of
a list of procfs values (no longer true after sysfs migration) so just iterates
that list and calls the write for matched names (but also needs to supply a 
userspace
buffer so looks much uglier too).

Hopefully this makes at least some sense.

> You can look up attributes for a kobject easily in the show/store
> functions (and some drivers just have a generic one and then you look at
> the string to see which attribute you are wanting to reference.)  But
> you seem to be working backwards here, why do you have to look up a
> kobject?

But that leads to the need to list attribute names essentially twice:
once for the attributes list, once in the show/set function to figure
out how to deal with that name.

> What is wrong with the "normal" way to interact with kobject attributes
> from sysfs?
> 
> What does your "process proc" function do?  Where does it get called
> from?
> 
> totally confused,
> 
> greg k-h



Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin

On Mar 18, 2017, at 6:34 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>> Ever since sysfs migration, class_process_proc_param stopped working
>> correctly as all the useful params were no longer present as lvars.
>> Replace all the nasty fake proc writes with hopefully less nasty
>> kobject attribute search and then update the attributes as needed.
>> 
>> Signed-off-by: Oleg Drokin 
>> Reported-by: Al Viro 
>> ---
>> Al has quite rightfully complained in the past that class_process_proc_param
>> is a terrible piece of code and needs to go.
>> This patch is an attempt at improving it somewhat and in process drop
>> all the user/kernel address space games we needed to play to make it work
>> in the past (and which I suspect attracted Al's attention in the first 
>> place).
>> 
>> Now I wonder if iterating kobject attributes like that would be ok with
>> you Greg, or do you think there is a better way?
>> class_find_write_attr could be turned into something generic since it's
>> certainly convenient to reuse same table of name-write_method pairs,
>> but I did some cursory research and nobody else seems to need anything
>> of the sort in-tree.
>> 
>> I know ll_process_config is still awful and I will likely just
>> replace the current hack with kset_find_obj, but I just wanted to make
>> sure this new approach would be ok before spending too much time on it.
> 
> I'm not quite sure what exactly you are even trying to do here.  What is
> this interface?  Who calls it, and how?  What does it want to do?

This is a configuration client code.
Management server has ability to pass down config information in the form of:
fsname.subsystem.attribute=value to clients and other servers
(subsystem determines if it's something of interest of clients or servers or
both).
This could be changed in the real time - i.e. you update it on the server and
that gets propagated to all the clients/servers, so no need to ssh into
every node to manually apply those changes and worry about client restarts
(the config is remembered at the management server and would be applied to any
new nodes connecting/across server restarts and such).

So the way it then works then is once the string 
fsname.subsystem.attribute=value is delivered to the client, we find all 
instances of filesystem with fsname and then
all subsystems within it (one kobject per subsystem instance) and then update 
the
attributes to the value supplied.

The same filesystem might be mounted more than once and then some layers might 
have
multiple instances inside a single filesystems.

In the end it would turn something like lustre.osc.max_dirty_mb=128 into
writes to
/sys/fs/lustre/osc/lustre-OST-osc-8800d75ca000/max_dirty_mb and
/sys/fs/lustre/osc/lustre-OST0001-osc-8800d75ca000/max_dirty_mb
without actually iterating in sysfs namespace.

The alternative we considered is we can probably just do an upcall and have
a userspace tool called with the parameter verbatim and try to figure it out,
but that seems a lot less ideal, and also we'll get a bunch of complications 
from
containers and such too, I imagine.

The function pre-this patch is assuming that all these values are part of
a list of procfs values (no longer true after sysfs migration) so just iterates
that list and calls the write for matched names (but also needs to supply a 
userspace
buffer so looks much uglier too).

Hopefully this makes at least some sense.

> You can look up attributes for a kobject easily in the show/store
> functions (and some drivers just have a generic one and then you look at
> the string to see which attribute you are wanting to reference.)  But
> you seem to be working backwards here, why do you have to look up a
> kobject?

But that leads to the need to list attribute names essentially twice:
once for the attributes list, once in the show/set function to figure
out how to deal with that name.

> What is wrong with the "normal" way to interact with kobject attributes
> from sysfs?
> 
> What does your "process proc" function do?  Where does it get called
> from?
> 
> totally confused,
> 
> greg k-h



[PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin
Ever since sysfs migration, class_process_proc_param stopped working
correctly as all the useful params were no longer present as lvars.
Replace all the nasty fake proc writes with hopefully less nasty
kobject attribute search and then update the attributes as needed.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
Reported-by: Al Viro <v...@zeniv.linux.org.uk>
---
Al has quite rightfully complained in the past that class_process_proc_param
is a terrible piece of code and needs to go.
This patch is an attempt at improving it somewhat and in process drop
all the user/kernel address space games we needed to play to make it work
in the past (and which I suspect attracted Al's attention in the first place).

Now I wonder if iterating kobject attributes like that would be ok with
you Greg, or do you think there is a better way?
class_find_write_attr could be turned into something generic since it's
certainly convenient to reuse same table of name-write_method pairs,
but I did some cursory research and nobody else seems to need anything
of the sort in-tree.

I know ll_process_config is still awful and I will likely just
replace the current hack with kset_find_obj, but I just wanted to make
sure this new approach would be ok before spending too much time on it.

Thanks!

 drivers/staging/lustre/lustre/include/obd_class.h  |  4 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c| 10 +--
 drivers/staging/lustre/lustre/lov/lov_obd.c|  3 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c|  3 +-
 .../staging/lustre/lustre/obdclass/obd_config.c| 78 ++
 drivers/staging/lustre/lustre/osc/osc_request.c|  3 +-
 6 files changed, 44 insertions(+), 57 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_class.h 
b/drivers/staging/lustre/lustre/include/obd_class.h
index 083a6ff..badafb8 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -114,8 +114,8 @@ typedef int (*llog_cb_t)(const struct lu_env *, struct 
llog_handle *,
 struct llog_rec_hdr *, void *);
 /* obd_config.c */
 int class_process_config(struct lustre_cfg *lcfg);
-int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
-struct lustre_cfg *lcfg, void *data);
+int class_process_attr_param(char *prefix, struct kobject *kobj,
+struct lustre_cfg *lcfg);
 struct obd_device *class_incref(struct obd_device *obd,
const char *scope, const void *source);
 void class_decref(struct obd_device *obd,
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 7b80040..192b877 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -2259,7 +2259,7 @@ int ll_obd_statfs(struct inode *inode, void __user *arg)
 int ll_process_config(struct lustre_cfg *lcfg)
 {
char *ptr;
-   void *sb;
+   struct super_block *sb;
struct lprocfs_static_vars lvars;
unsigned long x;
int rc = 0;
@@ -2273,15 +2273,15 @@ int ll_process_config(struct lustre_cfg *lcfg)
rc = kstrtoul(ptr, 16, );
if (rc != 0)
return -EINVAL;
-   sb = (void *)x;
+   sb = (struct super_block *)x;
/* This better be a real Lustre superblock! */
-   LASSERT(s2lsi((struct super_block *)sb)->lsi_lmd->lmd_magic == 
LMD_MAGIC);
+   LASSERT(s2lsi(sb)->lsi_lmd->lmd_magic == LMD_MAGIC);
 
/* Note we have not called client_common_fill_super yet, so
 * proc fns must be able to handle that!
 */
-   rc = class_process_proc_param(PARAM_LLITE, lvars.obd_vars,
- lcfg, sb);
+   rc = class_process_attr_param(PARAM_LLITE, _s2sbi(sb)->ll_kobj,
+ lcfg);
if (rc > 0)
rc = 0;
return rc;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c 
b/drivers/staging/lustre/lustre/lov/lov_obd.c
index b3161fb..c33a327 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -926,8 +926,7 @@ int lov_process_config_base(struct obd_device *obd, struct 
lustre_cfg *lcfg,
 
lprocfs_lov_init_vars();
 
-   rc = class_process_proc_param(PARAM_LOV, lvars.obd_vars,
- lcfg, obd);
+   rc = class_process_attr_param(PARAM_LOV, >obd_kobj, lcfg);
if (rc > 0)
rc = 0;
goto out;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c 
b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 6bc2fb8..00387b8 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2670,8 +2670,7 @@ 

[PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin
Ever since sysfs migration, class_process_proc_param stopped working
correctly as all the useful params were no longer present as lvars.
Replace all the nasty fake proc writes with hopefully less nasty
kobject attribute search and then update the attributes as needed.

Signed-off-by: Oleg Drokin 
Reported-by: Al Viro 
---
Al has quite rightfully complained in the past that class_process_proc_param
is a terrible piece of code and needs to go.
This patch is an attempt at improving it somewhat and in process drop
all the user/kernel address space games we needed to play to make it work
in the past (and which I suspect attracted Al's attention in the first place).

Now I wonder if iterating kobject attributes like that would be ok with
you Greg, or do you think there is a better way?
class_find_write_attr could be turned into something generic since it's
certainly convenient to reuse same table of name-write_method pairs,
but I did some cursory research and nobody else seems to need anything
of the sort in-tree.

I know ll_process_config is still awful and I will likely just
replace the current hack with kset_find_obj, but I just wanted to make
sure this new approach would be ok before spending too much time on it.

Thanks!

 drivers/staging/lustre/lustre/include/obd_class.h  |  4 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c| 10 +--
 drivers/staging/lustre/lustre/lov/lov_obd.c|  3 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c|  3 +-
 .../staging/lustre/lustre/obdclass/obd_config.c| 78 ++
 drivers/staging/lustre/lustre/osc/osc_request.c|  3 +-
 6 files changed, 44 insertions(+), 57 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_class.h 
b/drivers/staging/lustre/lustre/include/obd_class.h
index 083a6ff..badafb8 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -114,8 +114,8 @@ typedef int (*llog_cb_t)(const struct lu_env *, struct 
llog_handle *,
 struct llog_rec_hdr *, void *);
 /* obd_config.c */
 int class_process_config(struct lustre_cfg *lcfg);
-int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
-struct lustre_cfg *lcfg, void *data);
+int class_process_attr_param(char *prefix, struct kobject *kobj,
+struct lustre_cfg *lcfg);
 struct obd_device *class_incref(struct obd_device *obd,
const char *scope, const void *source);
 void class_decref(struct obd_device *obd,
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 7b80040..192b877 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -2259,7 +2259,7 @@ int ll_obd_statfs(struct inode *inode, void __user *arg)
 int ll_process_config(struct lustre_cfg *lcfg)
 {
char *ptr;
-   void *sb;
+   struct super_block *sb;
struct lprocfs_static_vars lvars;
unsigned long x;
int rc = 0;
@@ -2273,15 +2273,15 @@ int ll_process_config(struct lustre_cfg *lcfg)
rc = kstrtoul(ptr, 16, );
if (rc != 0)
return -EINVAL;
-   sb = (void *)x;
+   sb = (struct super_block *)x;
/* This better be a real Lustre superblock! */
-   LASSERT(s2lsi((struct super_block *)sb)->lsi_lmd->lmd_magic == 
LMD_MAGIC);
+   LASSERT(s2lsi(sb)->lsi_lmd->lmd_magic == LMD_MAGIC);
 
/* Note we have not called client_common_fill_super yet, so
 * proc fns must be able to handle that!
 */
-   rc = class_process_proc_param(PARAM_LLITE, lvars.obd_vars,
- lcfg, sb);
+   rc = class_process_attr_param(PARAM_LLITE, _s2sbi(sb)->ll_kobj,
+ lcfg);
if (rc > 0)
rc = 0;
return rc;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c 
b/drivers/staging/lustre/lustre/lov/lov_obd.c
index b3161fb..c33a327 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -926,8 +926,7 @@ int lov_process_config_base(struct obd_device *obd, struct 
lustre_cfg *lcfg,
 
lprocfs_lov_init_vars();
 
-   rc = class_process_proc_param(PARAM_LOV, lvars.obd_vars,
- lcfg, obd);
+   rc = class_process_attr_param(PARAM_LOV, >obd_kobj, lcfg);
if (rc > 0)
rc = 0;
goto out;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c 
b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 6bc2fb8..00387b8 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2670,8 +2670,7 @@ static int mdc_process_config(struct obd_device *

Re: [PATCH 5/5] staging/lustre: Use generic range rwlock

2017-03-07 Thread Oleg Drokin

On Mar 7, 2017, at 12:03 AM, Davidlohr Bueso wrote:

> This replaces the in-house version, which is also derived
> from Jan's interval tree implementation.
> 
> Cc: oleg.dro...@intel.com
> Cc: andreas.dil...@intel.com
> Cc: jsimm...@infradead.org
> Cc: lustre-de...@lists.lustre.org
> 
> Signed-off-by: Davidlohr Bueso 
> ---
> XXX: compile tested only. In house uses 'ulong long', generic uses 'ulong', 
> is this a problem?

Hm, cannot seem to find the other patches in this series anywhere to verify and 
my subscription to linux-kernel broke as it turns out.
You mean the range is ulong? So only can have this working up to 2G offsets on 
the
32bit systems and then wrap around?

> 
> drivers/staging/lustre/lustre/llite/Makefile   |   2 +-
> drivers/staging/lustre/lustre/llite/file.c |  21 +-
> .../staging/lustre/lustre/llite/llite_internal.h   |   4 +-
> drivers/staging/lustre/lustre/llite/llite_lib.c|   3 +-
> drivers/staging/lustre/lustre/llite/range_lock.c   | 239 -
> drivers/staging/lustre/lustre/llite/range_lock.h   |  82 ---
> 6 files changed, 15 insertions(+), 336 deletions(-)
> delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.c
> delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.h
> 
> diff --git a/drivers/staging/lustre/lustre/llite/Makefile 
> b/drivers/staging/lustre/lustre/llite/Makefile
> index 322d4fa63f5d..922a901bc62c 100644
> --- a/drivers/staging/lustre/lustre/llite/Makefile
> +++ b/drivers/staging/lustre/lustre/llite/Makefile
> @@ -1,6 +1,6 @@
> obj-$(CONFIG_LUSTRE_FS) += lustre.o
> lustre-y := dcache.o dir.o file.o llite_lib.o llite_nfs.o \
> - rw.o rw26.o namei.o symlink.o llite_mmap.o range_lock.o \
> + rw.o rw26.o namei.o symlink.o llite_mmap.o \
>   xattr.o xattr_cache.o xattr_security.o \
>   super25.o statahead.o glimpse.o lcommon_cl.o lcommon_misc.o \
>   vvp_dev.o vvp_page.o vvp_lock.o vvp_io.o vvp_object.o \
> diff --git a/drivers/staging/lustre/lustre/llite/file.c 
> b/drivers/staging/lustre/lustre/llite/file.c
> index 481c0d01d4c6..1a14a79f87f8 100644
> --- a/drivers/staging/lustre/lustre/llite/file.c
> +++ b/drivers/staging/lustre/lustre/llite/file.c
> @@ -42,6 +42,7 @@
> #include 
> #include 
> #include 
> +#include 
> #include "../include/lustre/ll_fiemap.h"
> #include "../include/lustre/lustre_ioctl.h"
> #include "../include/lustre_swab.h"
> @@ -1055,7 +1056,7 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   struct ll_inode_info *lli = ll_i2info(file_inode(file));
>   struct ll_file_data  *fd  = LUSTRE_FPRIVATE(file);
>   struct vvp_io *vio = vvp_env_io(env);
> - struct range_lock range;
> + struct range_rwlock range;
>   struct cl_io *io;
>   ssize_t result = 0;
>   int rc = 0;
> @@ -1072,9 +1073,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   bool range_locked = false;
> 
>   if (file->f_flags & O_APPEND)
> - range_lock_init(, 0, LUSTRE_EOF);
> + range_rwlock_init(, 0, LUSTRE_EOF);
>   else
> - range_lock_init(, *ppos, *ppos + count - 1);
> + range_rwlock_init(, *ppos, *ppos + count - 1);
> 
>   vio->vui_fd  = LUSTRE_FPRIVATE(file);
>   vio->vui_iter = args->u.normal.via_iter;
> @@ -1087,10 +1088,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   if (((iot == CIT_WRITE) ||
>(iot == CIT_READ && (file->f_flags & O_DIRECT))) &&
>   !(vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED)) {
> - CDEBUG(D_VFSTRACE, "Range lock [%llu, %llu]\n",
> -range.rl_node.in_extent.start,
> -range.rl_node.in_extent.end);
> - rc = range_lock(>lli_write_tree, );
> + CDEBUG(D_VFSTRACE, "Range lock [%lu, %lu]\n",
> +range.node.start, range.node.last);
> + rc = 
> range_write_lock_interruptible(>lli_write_tree, );
>   if (rc < 0)
>   goto out;
> 
> @@ -1100,10 +1100,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   rc = cl_io_loop(env, io);
>   ll_cl_remove(file, env);
>   if (range_locked) {
> - CDEBUG(D_VFSTRACE, "Range unlock [%llu, %llu]\n",
> -range.rl_node.in_extent.start,
> -range.rl_node.in_extent.end);
> - range_unlock(>lli_write_tree, );
> + CDEBUG(D_VFSTRACE, "Range unlock [%lu, %lu]\n",
> +range.node.start, range.node.last);
> + range_write_unlock(>lli_write_tree, );
>   }
>   } else {
> 

Re: [PATCH 5/5] staging/lustre: Use generic range rwlock

2017-03-07 Thread Oleg Drokin

On Mar 7, 2017, at 12:03 AM, Davidlohr Bueso wrote:

> This replaces the in-house version, which is also derived
> from Jan's interval tree implementation.
> 
> Cc: oleg.dro...@intel.com
> Cc: andreas.dil...@intel.com
> Cc: jsimm...@infradead.org
> Cc: lustre-de...@lists.lustre.org
> 
> Signed-off-by: Davidlohr Bueso 
> ---
> XXX: compile tested only. In house uses 'ulong long', generic uses 'ulong', 
> is this a problem?

Hm, cannot seem to find the other patches in this series anywhere to verify and 
my subscription to linux-kernel broke as it turns out.
You mean the range is ulong? So only can have this working up to 2G offsets on 
the
32bit systems and then wrap around?

> 
> drivers/staging/lustre/lustre/llite/Makefile   |   2 +-
> drivers/staging/lustre/lustre/llite/file.c |  21 +-
> .../staging/lustre/lustre/llite/llite_internal.h   |   4 +-
> drivers/staging/lustre/lustre/llite/llite_lib.c|   3 +-
> drivers/staging/lustre/lustre/llite/range_lock.c   | 239 -
> drivers/staging/lustre/lustre/llite/range_lock.h   |  82 ---
> 6 files changed, 15 insertions(+), 336 deletions(-)
> delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.c
> delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.h
> 
> diff --git a/drivers/staging/lustre/lustre/llite/Makefile 
> b/drivers/staging/lustre/lustre/llite/Makefile
> index 322d4fa63f5d..922a901bc62c 100644
> --- a/drivers/staging/lustre/lustre/llite/Makefile
> +++ b/drivers/staging/lustre/lustre/llite/Makefile
> @@ -1,6 +1,6 @@
> obj-$(CONFIG_LUSTRE_FS) += lustre.o
> lustre-y := dcache.o dir.o file.o llite_lib.o llite_nfs.o \
> - rw.o rw26.o namei.o symlink.o llite_mmap.o range_lock.o \
> + rw.o rw26.o namei.o symlink.o llite_mmap.o \
>   xattr.o xattr_cache.o xattr_security.o \
>   super25.o statahead.o glimpse.o lcommon_cl.o lcommon_misc.o \
>   vvp_dev.o vvp_page.o vvp_lock.o vvp_io.o vvp_object.o \
> diff --git a/drivers/staging/lustre/lustre/llite/file.c 
> b/drivers/staging/lustre/lustre/llite/file.c
> index 481c0d01d4c6..1a14a79f87f8 100644
> --- a/drivers/staging/lustre/lustre/llite/file.c
> +++ b/drivers/staging/lustre/lustre/llite/file.c
> @@ -42,6 +42,7 @@
> #include 
> #include 
> #include 
> +#include 
> #include "../include/lustre/ll_fiemap.h"
> #include "../include/lustre/lustre_ioctl.h"
> #include "../include/lustre_swab.h"
> @@ -1055,7 +1056,7 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   struct ll_inode_info *lli = ll_i2info(file_inode(file));
>   struct ll_file_data  *fd  = LUSTRE_FPRIVATE(file);
>   struct vvp_io *vio = vvp_env_io(env);
> - struct range_lock range;
> + struct range_rwlock range;
>   struct cl_io *io;
>   ssize_t result = 0;
>   int rc = 0;
> @@ -1072,9 +1073,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   bool range_locked = false;
> 
>   if (file->f_flags & O_APPEND)
> - range_lock_init(, 0, LUSTRE_EOF);
> + range_rwlock_init(, 0, LUSTRE_EOF);
>   else
> - range_lock_init(, *ppos, *ppos + count - 1);
> + range_rwlock_init(, *ppos, *ppos + count - 1);
> 
>   vio->vui_fd  = LUSTRE_FPRIVATE(file);
>   vio->vui_iter = args->u.normal.via_iter;
> @@ -1087,10 +1088,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   if (((iot == CIT_WRITE) ||
>(iot == CIT_READ && (file->f_flags & O_DIRECT))) &&
>   !(vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED)) {
> - CDEBUG(D_VFSTRACE, "Range lock [%llu, %llu]\n",
> -range.rl_node.in_extent.start,
> -range.rl_node.in_extent.end);
> - rc = range_lock(>lli_write_tree, );
> + CDEBUG(D_VFSTRACE, "Range lock [%lu, %lu]\n",
> +range.node.start, range.node.last);
> + rc = 
> range_write_lock_interruptible(>lli_write_tree, );
>   if (rc < 0)
>   goto out;
> 
> @@ -1100,10 +1100,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
> vvp_io_args *args,
>   rc = cl_io_loop(env, io);
>   ll_cl_remove(file, env);
>   if (range_locked) {
> - CDEBUG(D_VFSTRACE, "Range unlock [%llu, %llu]\n",
> -range.rl_node.in_extent.start,
> -range.rl_node.in_extent.end);
> - range_unlock(>lli_write_tree, );
> + CDEBUG(D_VFSTRACE, "Range unlock [%lu, %lu]\n",
> +range.node.start, range.node.last);
> + range_write_unlock(>lli_write_tree, );
>   }
>   } else {
>   /* 

Re: [lustre-devel] [PATCH] staging: lustre: fix sparse warning about different address spaces

2017-03-06 Thread Oleg Drokin

On Mar 1, 2017, at 6:57 PM, Mario Bambagini wrote:

> fixed the following sparse warning by adding proper cast:
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74: warning: 
> incorrect type in argument 2 (different address spaces)
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:expected 
> char const [noderef] *
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:got char 
> *[assigned] sval
> 
> Signed-off-by: Mario Bambagini 

The patch is fine, but just be advised this whole function is going away real 
soon now
per Al Viro request (and also because it no longer does what it should).

> ---
> drivers/staging/lustre/lustre/obdclass/obd_config.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c 
> b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> index 9ca84c7..8fce88f 100644
> --- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
> +++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> @@ -1052,7 +1052,8 @@ int class_process_proc_param(char *prefix, struct 
> lprocfs_vars *lvars,
> 
>   oldfs = get_fs();
>   set_fs(KERNEL_DS);
> - rc = var->fops->write(, sval,
> + rc = var->fops->write(,
> + (const char __user *)sval,
>   vallen, NULL);
>   set_fs(oldfs);
>   }
> -- 
> 2.1.4
> 
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org



Re: [lustre-devel] [PATCH] staging: lustre: fix sparse warning about different address spaces

2017-03-06 Thread Oleg Drokin

On Mar 1, 2017, at 6:57 PM, Mario Bambagini wrote:

> fixed the following sparse warning by adding proper cast:
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74: warning: 
> incorrect type in argument 2 (different address spaces)
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:expected 
> char const [noderef] *
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:got char 
> *[assigned] sval
> 
> Signed-off-by: Mario Bambagini 

The patch is fine, but just be advised this whole function is going away real 
soon now
per Al Viro request (and also because it no longer does what it should).

> ---
> drivers/staging/lustre/lustre/obdclass/obd_config.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c 
> b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> index 9ca84c7..8fce88f 100644
> --- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
> +++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> @@ -1052,7 +1052,8 @@ int class_process_proc_param(char *prefix, struct 
> lprocfs_vars *lvars,
> 
>   oldfs = get_fs();
>   set_fs(KERNEL_DS);
> - rc = var->fops->write(, sval,
> + rc = var->fops->write(,
> + (const char __user *)sval,
>   vallen, NULL);
>   set_fs(oldfs);
>   }
> -- 
> 2.1.4
> 
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org



Re: [PATCH 13/14] staging: lustre: llog: limit file size of plain logs

2017-02-24 Thread Oleg Drokin

On Feb 24, 2017, at 11:59 AM, Greg Kroah-Hartman wrote:

> On Sat, Feb 18, 2017 at 04:47:14PM -0500, James Simmons wrote:
>> From: Alex Zhuravlev <alexey.zhurav...@intel.com>
>> 
>> on small filesystems plain log can grow dramatically. especially
>> given large record sizes produced by DNE and extended chunksize.
>> I saw >50% of space consumed by a single llog file which was still
>> in use. this leads to test failures (sanityn, etc).
>> the patch introduces additional limit on plain llog size, which
>> is calculated as /64 (128MB at most) at llog creation
>> time.
>> 
>> Signed-off-by: Alex Zhuravlev <alexey.zhurav...@intel.com>
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6838
>> Reviewed-on: https://review.whamcloud.com/18028
>> Reviewed-by: Andreas Dilger <andreas.dil...@intel.com>
>> Reviewed-by: wangdi <di.w...@intel.com>
>> Reviewed-by: Mike Pershin <mike.pers...@intel.com>
>> Reviewed-by: Oleg Drokin <oleg.dro...@intel.com>
>> Signed-off-by: James Simmons <jsimm...@infradead.org>
>> ---
>> drivers/staging/lustre/lustre/obdclass/llog.c | 16 
>> 1 file changed, 16 insertions(+)
>> 
>> diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c 
>> b/drivers/staging/lustre/lustre/obdclass/llog.c
>> index 83c5b62..320ff6b 100644
>> --- a/drivers/staging/lustre/lustre/obdclass/llog.c
>> +++ b/drivers/staging/lustre/lustre/obdclass/llog.c
>> @@ -319,10 +319,26 @@ static int llog_process_thread(void *arg)
>>   * the case and re-read the current chunk
>>   * otherwise.
>>   */
>> +int records;
>> +
>>  if (index > loghandle->lgh_last_idx) {
>>  rc = 0;
>>  goto out;
>>  }
>> +/* <2 records means no more records
>> + * if the last record we processed was
>> + * the final one, then the underlying
>> + * object might have been destroyed yet.
>> + * we better don't access that..
>> + */
>> +mutex_lock(>lgh_hdr_mutex);
>> +records = loghandle->lgh_hdr->llh_count;
>> +mutex_unlock(>lgh_hdr_mutex);
>> +if (records <= 1) {
>> +rc = 0;
>> +goto out;
>> +}
> 
> 
> So you now use the lock, in only one place, when reading a single value?
> That makes no sense, it's obviously wrong, or not needed.
> 
> Please fix up these two patches…

Ah, this is in fact server-side fix, so all the other users were in the
parts not really present in the client.
James, we don't really need this patch in the client, I guess.



Re: [PATCH 13/14] staging: lustre: llog: limit file size of plain logs

2017-02-24 Thread Oleg Drokin

On Feb 24, 2017, at 11:59 AM, Greg Kroah-Hartman wrote:

> On Sat, Feb 18, 2017 at 04:47:14PM -0500, James Simmons wrote:
>> From: Alex Zhuravlev <alexey.zhurav...@intel.com>
>> 
>> on small filesystems plain log can grow dramatically. especially
>> given large record sizes produced by DNE and extended chunksize.
>> I saw >50% of space consumed by a single llog file which was still
>> in use. this leads to test failures (sanityn, etc).
>> the patch introduces additional limit on plain llog size, which
>> is calculated as /64 (128MB at most) at llog creation
>> time.
>> 
>> Signed-off-by: Alex Zhuravlev <alexey.zhurav...@intel.com>
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6838
>> Reviewed-on: https://review.whamcloud.com/18028
>> Reviewed-by: Andreas Dilger <andreas.dil...@intel.com>
>> Reviewed-by: wangdi <di.w...@intel.com>
>> Reviewed-by: Mike Pershin <mike.pers...@intel.com>
>> Reviewed-by: Oleg Drokin <oleg.dro...@intel.com>
>> Signed-off-by: James Simmons <jsimm...@infradead.org>
>> ---
>> drivers/staging/lustre/lustre/obdclass/llog.c | 16 
>> 1 file changed, 16 insertions(+)
>> 
>> diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c 
>> b/drivers/staging/lustre/lustre/obdclass/llog.c
>> index 83c5b62..320ff6b 100644
>> --- a/drivers/staging/lustre/lustre/obdclass/llog.c
>> +++ b/drivers/staging/lustre/lustre/obdclass/llog.c
>> @@ -319,10 +319,26 @@ static int llog_process_thread(void *arg)
>>   * the case and re-read the current chunk
>>   * otherwise.
>>   */
>> +int records;
>> +
>>  if (index > loghandle->lgh_last_idx) {
>>  rc = 0;
>>  goto out;
>>  }
>> +/* <2 records means no more records
>> + * if the last record we processed was
>> + * the final one, then the underlying
>> + * object might have been destroyed yet.
>> + * we better don't access that..
>> + */
>> +mutex_lock(>lgh_hdr_mutex);
>> +records = loghandle->lgh_hdr->llh_count;
>> +mutex_unlock(>lgh_hdr_mutex);
>> +if (records <= 1) {
>> +rc = 0;
>> +goto out;
>> +}
> 
> 
> So you now use the lock, in only one place, when reading a single value?
> That makes no sense, it's obviously wrong, or not needed.
> 
> Please fix up these two patches…

Ah, this is in fact server-side fix, so all the other users were in the
parts not really present in the client.
James, we don't really need this patch in the client, I guess.



Re: [PATCH 13/14] staging: lustre: llog: limit file size of plain logs

2017-02-24 Thread Oleg Drokin

On Feb 24, 2017, at 11:59 AM, Greg Kroah-Hartman wrote:

> On Sat, Feb 18, 2017 at 04:47:14PM -0500, James Simmons wrote:
>> From: Alex Zhuravlev 
>> 
>> on small filesystems plain log can grow dramatically. especially
>> given large record sizes produced by DNE and extended chunksize.
>> I saw >50% of space consumed by a single llog file which was still
>> in use. this leads to test failures (sanityn, etc).
>> the patch introduces additional limit on plain llog size, which
>> is calculated as /64 (128MB at most) at llog creation
>> time.
>> 
>> Signed-off-by: Alex Zhuravlev 
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6838
>> Reviewed-on: https://review.whamcloud.com/18028
>> Reviewed-by: Andreas Dilger 
>> Reviewed-by: wangdi 
>> Reviewed-by: Mike Pershin 
>> Reviewed-by: Oleg Drokin 
>> Signed-off-by: James Simmons 
>> ---
>> drivers/staging/lustre/lustre/obdclass/llog.c | 16 
>> 1 file changed, 16 insertions(+)
>> 
>> diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c 
>> b/drivers/staging/lustre/lustre/obdclass/llog.c
>> index 83c5b62..320ff6b 100644
>> --- a/drivers/staging/lustre/lustre/obdclass/llog.c
>> +++ b/drivers/staging/lustre/lustre/obdclass/llog.c
>> @@ -319,10 +319,26 @@ static int llog_process_thread(void *arg)
>>   * the case and re-read the current chunk
>>   * otherwise.
>>   */
>> +int records;
>> +
>>  if (index > loghandle->lgh_last_idx) {
>>  rc = 0;
>>  goto out;
>>  }
>> +/* <2 records means no more records
>> + * if the last record we processed was
>> + * the final one, then the underlying
>> + * object might have been destroyed yet.
>> + * we better don't access that..
>> + */
>> +mutex_lock(>lgh_hdr_mutex);
>> +records = loghandle->lgh_hdr->llh_count;
>> +mutex_unlock(>lgh_hdr_mutex);
>> +if (records <= 1) {
>> +rc = 0;
>> +goto out;
>> +}
> 
> 
> So you now use the lock, in only one place, when reading a single value?
> That makes no sense, it's obviously wrong, or not needed.
> 
> Please fix up these two patches…

Ah, this is in fact server-side fix, so all the other users were in the
parts not really present in the client.
James, we don't really need this patch in the client, I guess.



Re: [PATCH 13/14] staging: lustre: llog: limit file size of plain logs

2017-02-24 Thread Oleg Drokin

On Feb 24, 2017, at 11:59 AM, Greg Kroah-Hartman wrote:

> On Sat, Feb 18, 2017 at 04:47:14PM -0500, James Simmons wrote:
>> From: Alex Zhuravlev 
>> 
>> on small filesystems plain log can grow dramatically. especially
>> given large record sizes produced by DNE and extended chunksize.
>> I saw >50% of space consumed by a single llog file which was still
>> in use. this leads to test failures (sanityn, etc).
>> the patch introduces additional limit on plain llog size, which
>> is calculated as /64 (128MB at most) at llog creation
>> time.
>> 
>> Signed-off-by: Alex Zhuravlev 
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6838
>> Reviewed-on: https://review.whamcloud.com/18028
>> Reviewed-by: Andreas Dilger 
>> Reviewed-by: wangdi 
>> Reviewed-by: Mike Pershin 
>> Reviewed-by: Oleg Drokin 
>> Signed-off-by: James Simmons 
>> ---
>> drivers/staging/lustre/lustre/obdclass/llog.c | 16 
>> 1 file changed, 16 insertions(+)
>> 
>> diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c 
>> b/drivers/staging/lustre/lustre/obdclass/llog.c
>> index 83c5b62..320ff6b 100644
>> --- a/drivers/staging/lustre/lustre/obdclass/llog.c
>> +++ b/drivers/staging/lustre/lustre/obdclass/llog.c
>> @@ -319,10 +319,26 @@ static int llog_process_thread(void *arg)
>>   * the case and re-read the current chunk
>>   * otherwise.
>>   */
>> +int records;
>> +
>>  if (index > loghandle->lgh_last_idx) {
>>  rc = 0;
>>  goto out;
>>  }
>> +/* <2 records means no more records
>> + * if the last record we processed was
>> + * the final one, then the underlying
>> + * object might have been destroyed yet.
>> + * we better don't access that..
>> + */
>> +mutex_lock(>lgh_hdr_mutex);
>> +records = loghandle->lgh_hdr->llh_count;
>> +mutex_unlock(>lgh_hdr_mutex);
>> +if (records <= 1) {
>> +rc = 0;
>> +goto out;
>> +}
> 
> 
> So you now use the lock, in only one place, when reading a single value?
> That makes no sense, it's obviously wrong, or not needed.
> 
> Please fix up these two patches…

Ah, this is in fact server-side fix, so all the other users were in the
parts not really present in the client.
James, we don't really need this patch in the client, I guess.



Re: [PATCH v2] Style fixes

2017-01-16 Thread Oleg Drokin

On Jan 13, 2017, at 3:48 PM, Guillermo O. Freschi wrote:

> Missing braces on `if` statement.
> 
> Signed-off-by: Guillermo O. Freschi <ked...@gmail.com>
> 
> Reviewed-by: Andreas Dilger <andreas.dil...@intel.com>

Reviewed-by: Oleg Drokin <oleg.dro...@intel.com>
> ---
> drivers/staging/lustre/lustre/ldlm/ldlm_lock.c | 9 +
> 1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c 
> b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
> index f4cbc89b4f24..b66bc02646f1 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
> @@ -1024,11 +1024,11 @@ void ldlm_grant_lock(struct ldlm_lock *lock, struct 
> list_head *work_list)
>   if (work_list && lock->l_completion_ast)
>   ldlm_add_ast_work_item(lock, NULL, work_list);
> 
> - if (res->lr_type == LDLM_PLAIN || res->lr_type == LDLM_IBITS)
> + if (res->lr_type == LDLM_PLAIN || res->lr_type == LDLM_IBITS) {
>   ldlm_grant_lock_with_skiplist(lock);
> - else if (res->lr_type == LDLM_EXTENT)
> + } else if (res->lr_type == LDLM_EXTENT) {
>   ldlm_extent_add_lock(res, lock);
> - else if (res->lr_type == LDLM_FLOCK) {
> + } else if (res->lr_type == LDLM_FLOCK) {
>   /*
>* We should not add locks to granted list in the following 
> cases:
>* - this is an UNLOCK but not a real lock;
> @@ -1040,8 +1040,9 @@ void ldlm_grant_lock(struct ldlm_lock *lock, struct 
> list_head *work_list)
>   ldlm_is_test_lock(lock) || ldlm_is_flock_deadlock(lock))
>   return;
>   ldlm_resource_add_lock(res, >lr_granted, lock);
> - } else
> + } else {
>   LBUG();
> + }
> 
>   ldlm_pool_add(_res_to_ns(res)->ns_pool, lock);
> }
> -- 
> 2.11.0



Re: [PATCH v2] Style fixes

2017-01-16 Thread Oleg Drokin

On Jan 13, 2017, at 3:48 PM, Guillermo O. Freschi wrote:

> Missing braces on `if` statement.
> 
> Signed-off-by: Guillermo O. Freschi 
> 
> Reviewed-by: Andreas Dilger 

Reviewed-by: Oleg Drokin 
> ---
> drivers/staging/lustre/lustre/ldlm/ldlm_lock.c | 9 +
> 1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c 
> b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
> index f4cbc89b4f24..b66bc02646f1 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
> @@ -1024,11 +1024,11 @@ void ldlm_grant_lock(struct ldlm_lock *lock, struct 
> list_head *work_list)
>   if (work_list && lock->l_completion_ast)
>   ldlm_add_ast_work_item(lock, NULL, work_list);
> 
> - if (res->lr_type == LDLM_PLAIN || res->lr_type == LDLM_IBITS)
> + if (res->lr_type == LDLM_PLAIN || res->lr_type == LDLM_IBITS) {
>   ldlm_grant_lock_with_skiplist(lock);
> - else if (res->lr_type == LDLM_EXTENT)
> + } else if (res->lr_type == LDLM_EXTENT) {
>   ldlm_extent_add_lock(res, lock);
> - else if (res->lr_type == LDLM_FLOCK) {
> + } else if (res->lr_type == LDLM_FLOCK) {
>   /*
>* We should not add locks to granted list in the following 
> cases:
>* - this is an UNLOCK but not a real lock;
> @@ -1040,8 +1040,9 @@ void ldlm_grant_lock(struct ldlm_lock *lock, struct 
> list_head *work_list)
>   ldlm_is_test_lock(lock) || ldlm_is_flock_deadlock(lock))
>   return;
>   ldlm_resource_add_lock(res, >lr_granted, lock);
> - } else
> + } else {
>   LBUG();
> + }
> 
>   ldlm_pool_add(_res_to_ns(res)->ns_pool, lock);
> }
> -- 
> 2.11.0



Re: [PATCH 4/5] staging/lustre/obdclass: Combine two seq_printf() calls into one call in lprocfs_rd_state()

2017-01-12 Thread Oleg Drokin

On Jan 1, 2017, at 11:38 AM, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Sun, 1 Jan 2017 16:26:36 +0100
> 
> Some data were printed into a sequence by two separate function calls.
> Print the same data by a single function call instead.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 
> ---
> drivers/staging/lustre/lustre/obdclass/lprocfs_status.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c 
> b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> index 3f6fcab5a1fc..a167cbe8a50e 100644
> --- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> +++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> @@ -853,10 +853,8 @@ int lprocfs_rd_state(struct seq_file *m, void *data)
>   return rc;
> 
>   imp = obd->u.cli.cl_import;
> -
> - seq_printf(m, "current_state: %s\n",
> + seq_printf(m, "current_state: %s\nstate_history:\n",
>  ptlrpc_import_state_name(imp->imp_state));
> - seq_printf(m, "state_history:\n");

same as in that other patch - this actually makes the code a bit harder to read,
what's the perceived benefit to make a change like this?

>   k = imp->imp_state_hist_idx;
>   for (j = 0; j < IMP_STATE_HIST_LEN; j++) {
>   struct import_state_hist *ish =
> -- 
> 2.11.0



Re: [PATCH 4/5] staging/lustre/obdclass: Combine two seq_printf() calls into one call in lprocfs_rd_state()

2017-01-12 Thread Oleg Drokin

On Jan 1, 2017, at 11:38 AM, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Sun, 1 Jan 2017 16:26:36 +0100
> 
> Some data were printed into a sequence by two separate function calls.
> Print the same data by a single function call instead.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 
> ---
> drivers/staging/lustre/lustre/obdclass/lprocfs_status.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c 
> b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> index 3f6fcab5a1fc..a167cbe8a50e 100644
> --- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> +++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> @@ -853,10 +853,8 @@ int lprocfs_rd_state(struct seq_file *m, void *data)
>   return rc;
> 
>   imp = obd->u.cli.cl_import;
> -
> - seq_printf(m, "current_state: %s\n",
> + seq_printf(m, "current_state: %s\nstate_history:\n",
>  ptlrpc_import_state_name(imp->imp_state));
> - seq_printf(m, "state_history:\n");

same as in that other patch - this actually makes the code a bit harder to read,
what's the perceived benefit to make a change like this?

>   k = imp->imp_state_hist_idx;
>   for (j = 0; j < IMP_STATE_HIST_LEN; j++) {
>   struct import_state_hist *ish =
> -- 
> 2.11.0



Re: [PATCH 2/5] staging/lustre/mgc: Combine two seq_printf() calls into one call in lprocfs_mgc_rd_ir_state()

2017-01-12 Thread Oleg Drokin

On Jan 1, 2017, at 11:35 AM, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Sun, 1 Jan 2017 15:40:29 +0100
> 
> Some data were printed into a sequence by two separate function calls.
> Print the same data by a single function call instead.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 
> ---
> drivers/staging/lustre/lustre/mgc/mgc_request.c | 5 +
> 1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c 
> b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> index b9c522a3c7a4..a6ca48d7e96b 100644
> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> @@ -460,11 +460,8 @@ int lprocfs_mgc_rd_ir_state(struct seq_file *m, void 
> *data)
> 
>   imp = obd->u.cli.cl_import;
>   ocd = >imp_connect_data;
> -
> - seq_printf(m, "imperative_recovery: %s\n",
> + seq_printf(m, "imperative_recovery: %s\nclient_state:\n",
>  OCD_HAS_FLAG(ocd, IMP_RECOV) ? "ENABLED" : "DISABLED");
> - seq_printf(m, "client_state:\n");
> -

Ugh, do we really need this?
I know it saves one call to seq_printf, but this is not a super 
performance-critical
code, and two calls are actually easier to read, don't you think?

>   spin_lock(_list_lock);
>   list_for_each_entry(cld, _llog_list, cld_list_chain) {
>   if (!cld->cld_recover)
> -- 
> 2.11.0



Re: [PATCH 2/5] staging/lustre/mgc: Combine two seq_printf() calls into one call in lprocfs_mgc_rd_ir_state()

2017-01-12 Thread Oleg Drokin

On Jan 1, 2017, at 11:35 AM, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Sun, 1 Jan 2017 15:40:29 +0100
> 
> Some data were printed into a sequence by two separate function calls.
> Print the same data by a single function call instead.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 
> ---
> drivers/staging/lustre/lustre/mgc/mgc_request.c | 5 +
> 1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c 
> b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> index b9c522a3c7a4..a6ca48d7e96b 100644
> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> @@ -460,11 +460,8 @@ int lprocfs_mgc_rd_ir_state(struct seq_file *m, void 
> *data)
> 
>   imp = obd->u.cli.cl_import;
>   ocd = >imp_connect_data;
> -
> - seq_printf(m, "imperative_recovery: %s\n",
> + seq_printf(m, "imperative_recovery: %s\nclient_state:\n",
>  OCD_HAS_FLAG(ocd, IMP_RECOV) ? "ENABLED" : "DISABLED");
> - seq_printf(m, "client_state:\n");
> -

Ugh, do we really need this?
I know it saves one call to seq_printf, but this is not a super 
performance-critical
code, and two calls are actually easier to read, don't you think?

>   spin_lock(_list_lock);
>   list_for_each_entry(cld, _llog_list, cld_list_chain) {
>   if (!cld->cld_recover)
> -- 
> 2.11.0



Re: [PATCH v2 5/5] staging: lustre: headers: use proper byteorder functions in lustre_idl.h

2016-12-13 Thread Oleg Drokin

On Dec 13, 2016, at 3:31 AM, Dan Carpenter wrote:

> It used to be that great swathes of Lustre were used in both user space
> and kernel space.  We had huge unused modules in the kernel that were
> only used for user space.

Huh?
There was nothing of the sort.
There were huge parts of code that were used by the server, but sue to no server
in staging client, ended up being unused, though.

There were also (much smaller) bits that were supporting userspace client
(that is, a library that was able to mount lustre servers completely from
userspace by hijacking libc calls), but that was mostly gone by the time
we got into staging anyway.



Re: [PATCH v2 5/5] staging: lustre: headers: use proper byteorder functions in lustre_idl.h

2016-12-13 Thread Oleg Drokin

On Dec 13, 2016, at 3:31 AM, Dan Carpenter wrote:

> It used to be that great swathes of Lustre were used in both user space
> and kernel space.  We had huge unused modules in the kernel that were
> only used for user space.

Huh?
There was nothing of the sort.
There were huge parts of code that were used by the server, but sue to no server
in staging client, ended up being unused, though.

There were also (much smaller) bits that were supporting userspace client
(that is, a library that was able to mount lustre servers completely from
userspace by hijacking libc calls), but that was mostly gone by the time
we got into staging anyway.



Re: [PATCH 0/8] Sparse warning fixes in Lustre.

2016-12-12 Thread Oleg Drokin

On Dec 7, 2016, at 6:46 PM, Al Viro wrote:

> On Wed, Dec 07, 2016 at 05:41:26PM -0500, Oleg Drokin wrote:
>> This set of fixes aims at sparse warnings.
> 
> Speaking of the stuff sparse catches there: class_process_proc_param().
> I've tried to describe what I think of that Fine Piece Of Software
> several times, but I had to give up - my command of obscenity is not
> up to the task, neither in English nor in Russian.  Please, take it
> out.  Preferably - along with the ->ldo_process_config()/->process_config()
> thing.

Well, I can guess what you don't like in the remnants of the
"well, we have uniform procfs, so let's use that to our advantage and simplify
or config parsing".

But what's your beef with ldo_process_config()/->process_config(), I wonder?
Just a way to propagate config info across the layers.




Re: [PATCH 0/8] Sparse warning fixes in Lustre.

2016-12-12 Thread Oleg Drokin

On Dec 7, 2016, at 6:46 PM, Al Viro wrote:

> On Wed, Dec 07, 2016 at 05:41:26PM -0500, Oleg Drokin wrote:
>> This set of fixes aims at sparse warnings.
> 
> Speaking of the stuff sparse catches there: class_process_proc_param().
> I've tried to describe what I think of that Fine Piece Of Software
> several times, but I had to give up - my command of obscenity is not
> up to the task, neither in English nor in Russian.  Please, take it
> out.  Preferably - along with the ->ldo_process_config()/->process_config()
> thing.

Well, I can guess what you don't like in the remnants of the
"well, we have uniform procfs, so let's use that to our advantage and simplify
or config parsing".

But what's your beef with ldo_process_config()/->process_config(), I wonder?
Just a way to propagate config info across the layers.




[PATCH 3/8] staging/lustre/llite: mark ll_io_init() static

2016-12-07 Thread Oleg Drokin
It's not used anywhere out of this file.
Highlighted by sparse.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/llite/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index f634c11..d93f06a 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1016,7 +1016,7 @@ static bool file_is_noatime(const struct file *file)
return false;
 }
 
-void ll_io_init(struct cl_io *io, const struct file *file, int write)
+static void ll_io_init(struct cl_io *io, const struct file *file, int write)
 {
struct inode *inode = file_inode(file);
 
-- 
2.7.4



[PATCH 3/8] staging/lustre/llite: mark ll_io_init() static

2016-12-07 Thread Oleg Drokin
It's not used anywhere out of this file.
Highlighted by sparse.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index f634c11..d93f06a 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1016,7 +1016,7 @@ static bool file_is_noatime(const struct file *file)
return false;
 }
 
-void ll_io_init(struct cl_io *io, const struct file *file, int write)
+static void ll_io_init(struct cl_io *io, const struct file *file, int write)
 {
struct inode *inode = file_inode(file);
 
-- 
2.7.4



[PATCH 4/8] staging/lustre/lov: make lov_lsm_alloc() static

2016-12-07 Thread Oleg Drokin
It's not used anywhere outside of this file.
Highlighted by sparse.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/lov/lov_pack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c 
b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 6c93d18..68fa2de 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -198,7 +198,8 @@ static int lov_verify_lmm(void *lmm, int lmm_bytes, __u16 
*stripe_count)
return rc;
 }
 
-struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern, u32 magic)
+static struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern,
+  u32 magic)
 {
struct lov_stripe_md *lsm;
unsigned int i;
-- 
2.7.4



[PATCH 1/8] staging/lustre/llite: move root_squash from sysfs to debugfs

2016-12-07 Thread Oleg Drokin
root_squash control got accidentally moved to sysfs instead of
debugfs, and the write side of it was also broken expecting a
userspace buffer.
It contains both uid and gid values in a single file, so debugfs
is a clear place for it.

Reported-by: Al Viro <v...@zeniv.linux.org.uk>
Fixes: c948390f10ccc "fix inconsistencies of root squash feature"
Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/llite/lproc_llite.c | 27 +--
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c 
b/drivers/staging/lustre/lustre/llite/lproc_llite.c
index 03682c1..f3ee584 100644
--- a/drivers/staging/lustre/lustre/llite/lproc_llite.c
+++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c
@@ -924,27 +924,29 @@ static ssize_t ll_unstable_stats_seq_write(struct file 
*file,
 }
 LPROC_SEQ_FOPS(ll_unstable_stats);
 
-static ssize_t root_squash_show(struct kobject *kobj, struct attribute *attr,
-   char *buf)
+static int ll_root_squash_seq_show(struct seq_file *m, void *v)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct super_block *sb = m->private;
+   struct ll_sb_info *sbi = ll_s2sbi(sb);
struct root_squash_info *squash = >ll_squash;
 
-   return sprintf(buf, "%u:%u\n", squash->rsi_uid, squash->rsi_gid);
+   seq_printf(m, "%u:%u\n", squash->rsi_uid, squash->rsi_gid);
+   return 0;
 }
 
-static ssize_t root_squash_store(struct kobject *kobj, struct attribute *attr,
-const char *buffer, size_t count)
+static ssize_t ll_root_squash_seq_write(struct file *file,
+   const char __user *buffer,
+   size_t count, loff_t *off)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct seq_file *m = file->private_data;
+   struct super_block *sb = m->private;
+   struct ll_sb_info *sbi = ll_s2sbi(sb);
struct root_squash_info *squash = >ll_squash;
 
return lprocfs_wr_root_squash(buffer, count, squash,
- ll_get_fsname(sbi->ll_sb, NULL, 0));
+ ll_get_fsname(sb, NULL, 0));
 }
-LUSTRE_RW_ATTR(root_squash);
+LPROC_SEQ_FOPS(ll_root_squash);
 
 static int ll_nosquash_nids_seq_show(struct seq_file *m, void *v)
 {
@@ -997,6 +999,8 @@ static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
{ "statahead_stats",  _statahead_stats_fops, NULL, 0 },
{ "unstable_stats",   _unstable_stats_fops, NULL },
{ "sbi_flags",_sbi_flags_fops, NULL, 0 },
+   { .name =   "root_squash",
+ .fops =   _root_squash_fops},
{ .name =   "nosquash_nids",
  .fops =   _nosquash_nids_fops  },
{ NULL }
@@ -1027,7 +1031,6 @@ static struct attribute *llite_attrs[] = {
_attr_max_easize.attr,
_attr_default_easize.attr,
_attr_xattr_cache.attr,
-   _attr_root_squash.attr,
NULL,
 };
 
-- 
2.7.4



[PATCH 2/8] staging/lustre/ldlm: Correct itree_overlap_cb return type

2016-12-07 Thread Oleg Drokin
As per interval_search() prototype, the callback should return
enum, not int.
This fixes correspondign sparse warning.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index a4a291a..f4cbc89 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -1148,7 +1148,7 @@ static int lock_matches(struct ldlm_lock *lock, struct 
lock_match_data *data)
return INTERVAL_ITER_STOP;
 }
 
-static unsigned int itree_overlap_cb(struct interval_node *in, void *args)
+static enum interval_iter itree_overlap_cb(struct interval_node *in, void 
*args)
 {
struct ldlm_interval *node = to_ldlm_interval(in);
struct lock_match_data *data = args;
-- 
2.7.4



[PATCH 4/8] staging/lustre/lov: make lov_lsm_alloc() static

2016-12-07 Thread Oleg Drokin
It's not used anywhere outside of this file.
Highlighted by sparse.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/lov/lov_pack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c 
b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 6c93d18..68fa2de 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -198,7 +198,8 @@ static int lov_verify_lmm(void *lmm, int lmm_bytes, __u16 
*stripe_count)
return rc;
 }
 
-struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern, u32 magic)
+static struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern,
+  u32 magic)
 {
struct lov_stripe_md *lsm;
unsigned int i;
-- 
2.7.4



[PATCH 1/8] staging/lustre/llite: move root_squash from sysfs to debugfs

2016-12-07 Thread Oleg Drokin
root_squash control got accidentally moved to sysfs instead of
debugfs, and the write side of it was also broken expecting a
userspace buffer.
It contains both uid and gid values in a single file, so debugfs
is a clear place for it.

Reported-by: Al Viro 
Fixes: c948390f10ccc "fix inconsistencies of root squash feature"
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/lproc_llite.c | 27 +--
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c 
b/drivers/staging/lustre/lustre/llite/lproc_llite.c
index 03682c1..f3ee584 100644
--- a/drivers/staging/lustre/lustre/llite/lproc_llite.c
+++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c
@@ -924,27 +924,29 @@ static ssize_t ll_unstable_stats_seq_write(struct file 
*file,
 }
 LPROC_SEQ_FOPS(ll_unstable_stats);
 
-static ssize_t root_squash_show(struct kobject *kobj, struct attribute *attr,
-   char *buf)
+static int ll_root_squash_seq_show(struct seq_file *m, void *v)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct super_block *sb = m->private;
+   struct ll_sb_info *sbi = ll_s2sbi(sb);
struct root_squash_info *squash = >ll_squash;
 
-   return sprintf(buf, "%u:%u\n", squash->rsi_uid, squash->rsi_gid);
+   seq_printf(m, "%u:%u\n", squash->rsi_uid, squash->rsi_gid);
+   return 0;
 }
 
-static ssize_t root_squash_store(struct kobject *kobj, struct attribute *attr,
-const char *buffer, size_t count)
+static ssize_t ll_root_squash_seq_write(struct file *file,
+   const char __user *buffer,
+   size_t count, loff_t *off)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct seq_file *m = file->private_data;
+   struct super_block *sb = m->private;
+   struct ll_sb_info *sbi = ll_s2sbi(sb);
struct root_squash_info *squash = >ll_squash;
 
return lprocfs_wr_root_squash(buffer, count, squash,
- ll_get_fsname(sbi->ll_sb, NULL, 0));
+ ll_get_fsname(sb, NULL, 0));
 }
-LUSTRE_RW_ATTR(root_squash);
+LPROC_SEQ_FOPS(ll_root_squash);
 
 static int ll_nosquash_nids_seq_show(struct seq_file *m, void *v)
 {
@@ -997,6 +999,8 @@ static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
{ "statahead_stats",  _statahead_stats_fops, NULL, 0 },
{ "unstable_stats",   _unstable_stats_fops, NULL },
{ "sbi_flags",_sbi_flags_fops, NULL, 0 },
+   { .name =   "root_squash",
+ .fops =   _root_squash_fops},
{ .name =   "nosquash_nids",
  .fops =   _nosquash_nids_fops  },
{ NULL }
@@ -1027,7 +1031,6 @@ static struct attribute *llite_attrs[] = {
_attr_max_easize.attr,
_attr_default_easize.attr,
_attr_xattr_cache.attr,
-   _attr_root_squash.attr,
NULL,
 };
 
-- 
2.7.4



[PATCH 2/8] staging/lustre/ldlm: Correct itree_overlap_cb return type

2016-12-07 Thread Oleg Drokin
As per interval_search() prototype, the callback should return
enum, not int.
This fixes correspondign sparse warning.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index a4a291a..f4cbc89 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -1148,7 +1148,7 @@ static int lock_matches(struct ldlm_lock *lock, struct 
lock_match_data *data)
return INTERVAL_ITER_STOP;
 }
 
-static unsigned int itree_overlap_cb(struct interval_node *in, void *args)
+static enum interval_iter itree_overlap_cb(struct interval_node *in, void 
*args)
 {
struct ldlm_interval *node = to_ldlm_interval(in);
struct lock_match_data *data = args;
-- 
2.7.4



[PATCH 5/8] staging/lustre/osc: extern declare osc_caches in a header

2016-12-07 Thread Oleg Drokin
This avoids frowned upon extern in the C file, and also
shuts down a sparse warning of
drivers/staging/lustre/lustre/osc/osc_dev.c:55:22: warning: symbol 'osc_caches' 
was not declared. Should it be static?

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/osc/osc_internal.h | 2 ++
 drivers/staging/lustre/lustre/osc/osc_request.c  | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h 
b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 688783d..5cce82b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -181,6 +181,8 @@ static inline struct osc_device *obd2osc_dev(const struct 
obd_device *d)
return container_of0(d->obd_lu_dev, struct osc_device, od_cl.cd_lu_dev);
 }
 
+extern struct lu_kmem_descr osc_caches[];
+
 extern struct kmem_cache *osc_quota_kmem;
 struct osc_quota_info {
/** linkage for quota hash table */
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c 
b/drivers/staging/lustre/lustre/osc/osc_request.c
index 7143564..f691297 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2766,8 +2766,6 @@ static struct obd_ops osc_obd_ops = {
.quotactl   = osc_quotactl,
 };
 
-extern struct lu_kmem_descr osc_caches[];
-
 static int __init osc_init(void)
 {
struct lprocfs_static_vars lvars = { NULL };
-- 
2.7.4



[PATCH 5/8] staging/lustre/osc: extern declare osc_caches in a header

2016-12-07 Thread Oleg Drokin
This avoids frowned upon extern in the C file, and also
shuts down a sparse warning of
drivers/staging/lustre/lustre/osc/osc_dev.c:55:22: warning: symbol 'osc_caches' 
was not declared. Should it be static?

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/osc/osc_internal.h | 2 ++
 drivers/staging/lustre/lustre/osc/osc_request.c  | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h 
b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 688783d..5cce82b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -181,6 +181,8 @@ static inline struct osc_device *obd2osc_dev(const struct 
obd_device *d)
return container_of0(d->obd_lu_dev, struct osc_device, od_cl.cd_lu_dev);
 }
 
+extern struct lu_kmem_descr osc_caches[];
+
 extern struct kmem_cache *osc_quota_kmem;
 struct osc_quota_info {
/** linkage for quota hash table */
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c 
b/drivers/staging/lustre/lustre/osc/osc_request.c
index 7143564..f691297 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2766,8 +2766,6 @@ static struct obd_ops osc_obd_ops = {
.quotactl   = osc_quotactl,
 };
 
-extern struct lu_kmem_descr osc_caches[];
-
 static int __init osc_init(void)
 {
struct lprocfs_static_vars lvars = { NULL };
-- 
2.7.4



[PATCH 7/8] staging/lustre: Move lov_read_and_clear_async_rc declaration

2016-12-07 Thread Oleg Drokin
Move it to obd.h, so that it's included from both the users and
the actual definition, making sure they never get out of sync.
This also silences a sparse warning.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/include/obd.h| 3 +++
 drivers/staging/lustre/lustre/llite/vvp_internal.h | 2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h 
b/drivers/staging/lustre/lustre/include/obd.h
index 0f48e9c..1839f4f 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -43,6 +43,7 @@
 #include "lustre_fld.h"
 #include "lustre_handles.h"
 #include "lustre_intent.h"
+#include "cl_object.h"
 
 #define MAX_OBD_DEVICES 8192
 
@@ -76,6 +77,8 @@ static inline void loi_init(struct lov_oinfo *loi)
 struct lov_stripe_md;
 struct obd_info;
 
+int lov_read_and_clear_async_rc(struct cl_object *clob);
+
 typedef int (*obd_enqueue_update_f)(void *cookie, int rc);
 
 /* obd info for a particular level (lov, osc). */
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h 
b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index c60d041..f40fd7f 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -301,8 +301,6 @@ static inline struct vvp_lock *cl2vvp_lock(const struct 
cl_lock_slice *slice)
 # define CLOBINVRNT(env, clob, expr)   \
((void)sizeof(env), (void)sizeof(clob), (void)sizeof(!!(expr)))
 
-int lov_read_and_clear_async_rc(struct cl_object *clob);
-
 int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
struct cl_io *io);
 int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io);
-- 
2.7.4



[PATCH 6/8] staging/lustre: Declare lu_context/session_tags_default

2016-12-07 Thread Oleg Drokin
Make the declaration in a header, not as an extern in a C file,
that is frowned upon.
This also makes sparse a little bit more happy.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/include/lu_object.h  | 3 +++
 drivers/staging/lustre/lustre/obdclass/cl_object.c | 3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h 
b/drivers/staging/lustre/lustre/include/lu_object.h
index 260643e..69b2812 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -1326,5 +1326,8 @@ void lu_buf_realloc(struct lu_buf *buf, size_t size);
 int lu_buf_check_and_grow(struct lu_buf *buf, size_t len);
 struct lu_buf *lu_buf_check_and_alloc(struct lu_buf *buf, size_t len);
 
+extern __u32 lu_context_tags_default;
+extern __u32 lu_session_tags_default;
+
 /** @} lu */
 #endif /* __LUSTRE_LU_OBJECT_H */
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c 
b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index f5d4e23..703cb67 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -54,6 +54,7 @@
 #include 
 #include "../../include/linux/libcfs/libcfs_hash.h"/* for cfs_hash stuff */
 #include "../include/cl_object.h"
+#include "../include/lu_object.h"
 #include "cl_internal.h"
 
 static struct kmem_cache *cl_env_kmem;
@@ -61,8 +62,6 @@ static struct kmem_cache *cl_env_kmem;
 /** Lock class of cl_object_header::coh_attr_guard */
 static struct lock_class_key cl_attr_guard_class;
 
-extern __u32 lu_context_tags_default;
-extern __u32 lu_session_tags_default;
 /**
  * Initialize cl_object_header.
  */
-- 
2.7.4



[PATCH 0/8] Sparse warning fixes in Lustre.

2016-12-07 Thread Oleg Drokin
This set of fixes aims at sparse warnings.
Most of the patches are just moving declarations around
to deal with the
warning: symbol 'xxx' was not declared. Should it be static?
kind of messages.

Also a screwup with root_squash sysfs control is fixed.

Oleg Drokin (8):
  staging/lustre/llite: move root_squash from sysfs to debugfs
  staging/lustre/ldlm: Correct itree_overlap_cb return type
  staging/lustre/llite: mark ll_io_init() static
  staging/lustre/lov: make lov_lsm_alloc() static
  staging/lustre/osc: extern declare osc_caches in a header
  staging/lustre: Declare lu_context/session_tags_default
  staging/lustre: Move lov_read_and_clear_async_rc declaration
  staging/lustre/ptlrpc: Move nrs_conf_fifo extern to a header

 drivers/staging/lustre/lustre/include/lu_object.h  |  3 +++
 drivers/staging/lustre/lustre/include/obd.h|  3 +++
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c |  2 +-
 drivers/staging/lustre/lustre/llite/file.c |  2 +-
 drivers/staging/lustre/lustre/llite/lproc_llite.c  | 27 --
 drivers/staging/lustre/lustre/llite/vvp_internal.h |  2 --
 drivers/staging/lustre/lustre/lov/lov_pack.c   |  3 ++-
 drivers/staging/lustre/lustre/obdclass/cl_object.c |  3 +--
 drivers/staging/lustre/lustre/osc/osc_internal.h   |  2 ++
 drivers/staging/lustre/lustre/osc/osc_request.c|  2 --
 drivers/staging/lustre/lustre/ptlrpc/nrs.c |  3 ---
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |  3 +++
 12 files changed, 31 insertions(+), 24 deletions(-)

-- 
2.7.4



[PATCH 8/8] staging/lustre/ptlrpc: Move nrs_conf_fifo extern to a header

2016-12-07 Thread Oleg Drokin
This avoids having an extern definition in a C file which is bad,
and also silences sparse complaint as well.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/ptlrpc/nrs.c | 3 ---
 drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/nrs.c 
b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
index 7b6ffb1..ef19dbe 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/nrs.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
@@ -1559,9 +1559,6 @@ int ptlrpc_nrs_policy_control(const struct ptlrpc_service 
*svc,
return rc;
 }
 
-/* ptlrpc/nrs_fifo.c */
-extern struct ptlrpc_nrs_pol_conf nrs_conf_fifo;
-
 /**
  * Adds all policies that ship with the ptlrpc module, to NRS core's list of
  * policies \e nrs_core.nrs_policies.
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h 
b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
index e0f859c..8e6a805 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
@@ -226,6 +226,9 @@ struct ptlrpc_nrs_policy *nrs_request_policy(struct 
ptlrpc_nrs_request *nrq)
  sizeof(NRS_LPROCFS_QUANTUM_NAME_REG __stringify(LPROCFS_NRS_QUANTUM_MAX) " "  
\
NRS_LPROCFS_QUANTUM_NAME_HP __stringify(LPROCFS_NRS_QUANTUM_MAX))
 
+/* ptlrpc/nrs_fifo.c */
+extern struct ptlrpc_nrs_pol_conf nrs_conf_fifo;
+
 /* recovd_thread.c */
 
 int ptlrpc_expire_one_request(struct ptlrpc_request *req, int async_unlink);
-- 
2.7.4



[PATCH 7/8] staging/lustre: Move lov_read_and_clear_async_rc declaration

2016-12-07 Thread Oleg Drokin
Move it to obd.h, so that it's included from both the users and
the actual definition, making sure they never get out of sync.
This also silences a sparse warning.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/obd.h| 3 +++
 drivers/staging/lustre/lustre/llite/vvp_internal.h | 2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h 
b/drivers/staging/lustre/lustre/include/obd.h
index 0f48e9c..1839f4f 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -43,6 +43,7 @@
 #include "lustre_fld.h"
 #include "lustre_handles.h"
 #include "lustre_intent.h"
+#include "cl_object.h"
 
 #define MAX_OBD_DEVICES 8192
 
@@ -76,6 +77,8 @@ static inline void loi_init(struct lov_oinfo *loi)
 struct lov_stripe_md;
 struct obd_info;
 
+int lov_read_and_clear_async_rc(struct cl_object *clob);
+
 typedef int (*obd_enqueue_update_f)(void *cookie, int rc);
 
 /* obd info for a particular level (lov, osc). */
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h 
b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index c60d041..f40fd7f 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -301,8 +301,6 @@ static inline struct vvp_lock *cl2vvp_lock(const struct 
cl_lock_slice *slice)
 # define CLOBINVRNT(env, clob, expr)   \
((void)sizeof(env), (void)sizeof(clob), (void)sizeof(!!(expr)))
 
-int lov_read_and_clear_async_rc(struct cl_object *clob);
-
 int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
struct cl_io *io);
 int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io);
-- 
2.7.4



[PATCH 6/8] staging/lustre: Declare lu_context/session_tags_default

2016-12-07 Thread Oleg Drokin
Make the declaration in a header, not as an extern in a C file,
that is frowned upon.
This also makes sparse a little bit more happy.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/lu_object.h  | 3 +++
 drivers/staging/lustre/lustre/obdclass/cl_object.c | 3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h 
b/drivers/staging/lustre/lustre/include/lu_object.h
index 260643e..69b2812 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -1326,5 +1326,8 @@ void lu_buf_realloc(struct lu_buf *buf, size_t size);
 int lu_buf_check_and_grow(struct lu_buf *buf, size_t len);
 struct lu_buf *lu_buf_check_and_alloc(struct lu_buf *buf, size_t len);
 
+extern __u32 lu_context_tags_default;
+extern __u32 lu_session_tags_default;
+
 /** @} lu */
 #endif /* __LUSTRE_LU_OBJECT_H */
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c 
b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index f5d4e23..703cb67 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -54,6 +54,7 @@
 #include 
 #include "../../include/linux/libcfs/libcfs_hash.h"/* for cfs_hash stuff */
 #include "../include/cl_object.h"
+#include "../include/lu_object.h"
 #include "cl_internal.h"
 
 static struct kmem_cache *cl_env_kmem;
@@ -61,8 +62,6 @@ static struct kmem_cache *cl_env_kmem;
 /** Lock class of cl_object_header::coh_attr_guard */
 static struct lock_class_key cl_attr_guard_class;
 
-extern __u32 lu_context_tags_default;
-extern __u32 lu_session_tags_default;
 /**
  * Initialize cl_object_header.
  */
-- 
2.7.4



[PATCH 0/8] Sparse warning fixes in Lustre.

2016-12-07 Thread Oleg Drokin
This set of fixes aims at sparse warnings.
Most of the patches are just moving declarations around
to deal with the
warning: symbol 'xxx' was not declared. Should it be static?
kind of messages.

Also a screwup with root_squash sysfs control is fixed.

Oleg Drokin (8):
  staging/lustre/llite: move root_squash from sysfs to debugfs
  staging/lustre/ldlm: Correct itree_overlap_cb return type
  staging/lustre/llite: mark ll_io_init() static
  staging/lustre/lov: make lov_lsm_alloc() static
  staging/lustre/osc: extern declare osc_caches in a header
  staging/lustre: Declare lu_context/session_tags_default
  staging/lustre: Move lov_read_and_clear_async_rc declaration
  staging/lustre/ptlrpc: Move nrs_conf_fifo extern to a header

 drivers/staging/lustre/lustre/include/lu_object.h  |  3 +++
 drivers/staging/lustre/lustre/include/obd.h|  3 +++
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c |  2 +-
 drivers/staging/lustre/lustre/llite/file.c |  2 +-
 drivers/staging/lustre/lustre/llite/lproc_llite.c  | 27 --
 drivers/staging/lustre/lustre/llite/vvp_internal.h |  2 --
 drivers/staging/lustre/lustre/lov/lov_pack.c   |  3 ++-
 drivers/staging/lustre/lustre/obdclass/cl_object.c |  3 +--
 drivers/staging/lustre/lustre/osc/osc_internal.h   |  2 ++
 drivers/staging/lustre/lustre/osc/osc_request.c|  2 --
 drivers/staging/lustre/lustre/ptlrpc/nrs.c |  3 ---
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |  3 +++
 12 files changed, 31 insertions(+), 24 deletions(-)

-- 
2.7.4



[PATCH 8/8] staging/lustre/ptlrpc: Move nrs_conf_fifo extern to a header

2016-12-07 Thread Oleg Drokin
This avoids having an extern definition in a C file which is bad,
and also silences sparse complaint as well.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ptlrpc/nrs.c | 3 ---
 drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/nrs.c 
b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
index 7b6ffb1..ef19dbe 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/nrs.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
@@ -1559,9 +1559,6 @@ int ptlrpc_nrs_policy_control(const struct ptlrpc_service 
*svc,
return rc;
 }
 
-/* ptlrpc/nrs_fifo.c */
-extern struct ptlrpc_nrs_pol_conf nrs_conf_fifo;
-
 /**
  * Adds all policies that ship with the ptlrpc module, to NRS core's list of
  * policies \e nrs_core.nrs_policies.
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h 
b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
index e0f859c..8e6a805 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
@@ -226,6 +226,9 @@ struct ptlrpc_nrs_policy *nrs_request_policy(struct 
ptlrpc_nrs_request *nrq)
  sizeof(NRS_LPROCFS_QUANTUM_NAME_REG __stringify(LPROCFS_NRS_QUANTUM_MAX) " "  
\
NRS_LPROCFS_QUANTUM_NAME_HP __stringify(LPROCFS_NRS_QUANTUM_MAX))
 
+/* ptlrpc/nrs_fifo.c */
+extern struct ptlrpc_nrs_pol_conf nrs_conf_fifo;
+
 /* recovd_thread.c */
 
 int ptlrpc_expire_one_request(struct ptlrpc_request *req, int async_unlink);
-- 
2.7.4



Re: [lustre-devel] [PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 3:37 PM, Greg Kroah-Hartman wrote:

> On Wed, Dec 07, 2016 at 11:29:36AM -0500, Oleg Drokin wrote:
>> 
>> On Dec 7, 2016, at 5:40 AM, Greg Kroah-Hartman wrote:
>> 
>>> On Tue, Dec 06, 2016 at 10:53:48PM -0500, Oleg Drokin wrote:
>>>> I have been having a lot of unexplainable crashes in osc_lru_shrink
>>>> lately that I could not see a good explanation for and then I found
>>>> this patch that slip under the radar somehow that incorrectly
>>>> converted while loop for lru list iteration into
>>>> list_for_each_entry_safe totally ignoring that in the body of
>>>> the loop we drop spinlocks guarding this list and move list entries
>>>> around.
>>>> Not sure why it was not showing up right away, perhaps some of the
>>>> more recent LRU changes committed caused some extra pressure on this
>>>> code that finally highlighted the breakage.
>>>> 
>>>> Reverts: 8adddc36b1fc ("staging: lustre: osc: Use 
>>>> list_for_each_entry_safe")
>>>> CC: Bhaktipriya Shridhar <bhaktipriy...@gmail.com>
>>>> Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
>>>> ---
>>>> I also do not see this patch in any of the mailing lists I am subscribed 
>>>> to.
>>>> I wonder if there's a way to subscribe to those Greg's
>>>> "This is a note to let you know that I've just added the patch "
>>>> emails that concern Lustre to get them even if I am not on the CC list in
>>>> the patch itself?
>>> 
>>> This came in from the Outreacy application process, which now requires
>>> that they cc: the maintainers to catch this type of issue.  So you
>>> should have seen these types of patches this last round, the commit you
>>> reference was done before that change happened, sorry.
>> 
>> Do you know approximate date range of when these patches ere sneaking in?
> 
> Anytime before a few months ago.

Ugh, I see.

>> I'd like to take a look at the rest of it proactively just to see if there 
>> are
>> more undiscovered surprises?
> 
> If your testing isn't finding any problems, all should be good, right?
> :)

I see processes hanging waiting for RPC response (rarely) that is very 
suspicious,
but I did not get to the root of it yet.
Also my test system is limited in capacity, they don't let me anywhere near 
those
TOP100 systems with the staging client ;)




Re: [lustre-devel] [PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 3:37 PM, Greg Kroah-Hartman wrote:

> On Wed, Dec 07, 2016 at 11:29:36AM -0500, Oleg Drokin wrote:
>> 
>> On Dec 7, 2016, at 5:40 AM, Greg Kroah-Hartman wrote:
>> 
>>> On Tue, Dec 06, 2016 at 10:53:48PM -0500, Oleg Drokin wrote:
>>>> I have been having a lot of unexplainable crashes in osc_lru_shrink
>>>> lately that I could not see a good explanation for and then I found
>>>> this patch that slip under the radar somehow that incorrectly
>>>> converted while loop for lru list iteration into
>>>> list_for_each_entry_safe totally ignoring that in the body of
>>>> the loop we drop spinlocks guarding this list and move list entries
>>>> around.
>>>> Not sure why it was not showing up right away, perhaps some of the
>>>> more recent LRU changes committed caused some extra pressure on this
>>>> code that finally highlighted the breakage.
>>>> 
>>>> Reverts: 8adddc36b1fc ("staging: lustre: osc: Use 
>>>> list_for_each_entry_safe")
>>>> CC: Bhaktipriya Shridhar 
>>>> Signed-off-by: Oleg Drokin 
>>>> ---
>>>> I also do not see this patch in any of the mailing lists I am subscribed 
>>>> to.
>>>> I wonder if there's a way to subscribe to those Greg's
>>>> "This is a note to let you know that I've just added the patch "
>>>> emails that concern Lustre to get them even if I am not on the CC list in
>>>> the patch itself?
>>> 
>>> This came in from the Outreacy application process, which now requires
>>> that they cc: the maintainers to catch this type of issue.  So you
>>> should have seen these types of patches this last round, the commit you
>>> reference was done before that change happened, sorry.
>> 
>> Do you know approximate date range of when these patches ere sneaking in?
> 
> Anytime before a few months ago.

Ugh, I see.

>> I'd like to take a look at the rest of it proactively just to see if there 
>> are
>> more undiscovered surprises?
> 
> If your testing isn't finding any problems, all should be good, right?
> :)

I see processes hanging waiting for RPC response (rarely) that is very 
suspicious,
but I did not get to the root of it yet.
Also my test system is limited in capacity, they don't let me anywhere near 
those
TOP100 systems with the staging client ;)




Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 10:20 AM, Quentin Lambert wrote:

> Hi all,
> 
> I am looking at the drivers/staging/lustre/lustre/llite/dir.c:
> 
> 1469 /* Call mdc_iocontrol */
> 1470 rc = obd_iocontrol(LL_IOC_FID2MDTIDX, exp, sizeof(fid), 
> ,
> 1471);
> 1472 if (rc)
> 
> and sparse says:
> 
> drivers/staging/lustre/lustre/llite/dir.c:1471:37: warning: incorrect type in 
> argument 5 (different address spaces)
> 
> I was wondering if there was any value to add a cast to fix the warning?

These's a sister warning to this one, btw, in
drivers/staging/lustre/lustre/lmv/lmv_obd.c:996:19: warning: cast removes 
address space of expression

It is an ugly kludge and I guess needs to just be reworked somehow instead to 
avoid
these ugly games.



Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 10:20 AM, Quentin Lambert wrote:

> Hi all,
> 
> I am looking at the drivers/staging/lustre/lustre/llite/dir.c:
> 
> 1469 /* Call mdc_iocontrol */
> 1470 rc = obd_iocontrol(LL_IOC_FID2MDTIDX, exp, sizeof(fid), 
> ,
> 1471);
> 1472 if (rc)
> 
> and sparse says:
> 
> drivers/staging/lustre/lustre/llite/dir.c:1471:37: warning: incorrect type in 
> argument 5 (different address spaces)
> 
> I was wondering if there was any value to add a cast to fix the warning?

These's a sister warning to this one, btw, in
drivers/staging/lustre/lustre/lmv/lmv_obd.c:996:19: warning: cast removes 
address space of expression

It is an ugly kludge and I guess needs to just be reworked somehow instead to 
avoid
these ugly games.



Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 10:33 AM, Dan Carpenter wrote:

> Lustre is kind of a mess with regards to keeping user and kernel
> pointers separate.  It's not going to be easy to fix.

Actually I believe I made significant inroads in properly cleaning (almost?) 
everything
in this area about a year ago (to the point that only false positives were 
left).
I guess some more stuff crept in, I'll just make another run through and see
what else I can improve.


Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 10:33 AM, Dan Carpenter wrote:

> Lustre is kind of a mess with regards to keeping user and kernel
> pointers separate.  It's not going to be easy to fix.

Actually I believe I made significant inroads in properly cleaning (almost?) 
everything
in this area about a year ago (to the point that only false positives were 
left).
I guess some more stuff crept in, I'll just make another run through and see
what else I can improve.


Re: [PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 5:40 AM, Greg Kroah-Hartman wrote:

> On Tue, Dec 06, 2016 at 10:53:48PM -0500, Oleg Drokin wrote:
>> I have been having a lot of unexplainable crashes in osc_lru_shrink
>> lately that I could not see a good explanation for and then I found
>> this patch that slip under the radar somehow that incorrectly
>> converted while loop for lru list iteration into
>> list_for_each_entry_safe totally ignoring that in the body of
>> the loop we drop spinlocks guarding this list and move list entries
>> around.
>> Not sure why it was not showing up right away, perhaps some of the
>> more recent LRU changes committed caused some extra pressure on this
>> code that finally highlighted the breakage.
>> 
>> Reverts: 8adddc36b1fc ("staging: lustre: osc: Use list_for_each_entry_safe")
>> CC: Bhaktipriya Shridhar <bhaktipriy...@gmail.com>
>> Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
>> ---
>> I also do not see this patch in any of the mailing lists I am subscribed to.
>> I wonder if there's a way to subscribe to those Greg's
>> "This is a note to let you know that I've just added the patch "
>> emails that concern Lustre to get them even if I am not on the CC list in
>> the patch itself?
> 
> This came in from the Outreacy application process, which now requires
> that they cc: the maintainers to catch this type of issue.  So you
> should have seen these types of patches this last round, the commit you
> reference was done before that change happened, sorry.

Do you know approximate date range of when these patches ere sneaking in?
I'd like to take a look at the rest of it proactively just to see if there are
more undiscovered surprises?

> This change should go to stable kernels, so I'll mark it that way.

Thanks!


Re: [PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 5:40 AM, Greg Kroah-Hartman wrote:

> On Tue, Dec 06, 2016 at 10:53:48PM -0500, Oleg Drokin wrote:
>> I have been having a lot of unexplainable crashes in osc_lru_shrink
>> lately that I could not see a good explanation for and then I found
>> this patch that slip under the radar somehow that incorrectly
>> converted while loop for lru list iteration into
>> list_for_each_entry_safe totally ignoring that in the body of
>> the loop we drop spinlocks guarding this list and move list entries
>> around.
>> Not sure why it was not showing up right away, perhaps some of the
>> more recent LRU changes committed caused some extra pressure on this
>> code that finally highlighted the breakage.
>> 
>> Reverts: 8adddc36b1fc ("staging: lustre: osc: Use list_for_each_entry_safe")
>> CC: Bhaktipriya Shridhar 
>> Signed-off-by: Oleg Drokin 
>> ---
>> I also do not see this patch in any of the mailing lists I am subscribed to.
>> I wonder if there's a way to subscribe to those Greg's
>> "This is a note to let you know that I've just added the patch "
>> emails that concern Lustre to get them even if I am not on the CC list in
>> the patch itself?
> 
> This came in from the Outreacy application process, which now requires
> that they cc: the maintainers to catch this type of issue.  So you
> should have seen these types of patches this last round, the commit you
> reference was done before that change happened, sorry.

Do you know approximate date range of when these patches ere sneaking in?
I'd like to take a look at the rest of it proactively just to see if there are
more undiscovered surprises?

> This change should go to stable kernels, so I'll mark it that way.

Thanks!


[PATCH] staging/lustre/lnetselftest: Fix potential integer overflow

2016-12-06 Thread Oleg Drokin
It looks like if the passed in parameter is not present, but
parameter length is non zero, then sanity checks on the length
are skipped and lstcon_test_add() might then use incorrect
allocation that's prone to integer overflow size.

This patch ensures that parameter len is zero if parameter is
not present.

Reported-by: Dan Carpenter <dan.carpen...@oracle.com>
Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lnet/selftest/conctl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c 
b/drivers/staging/lustre/lnet/selftest/conctl.c
index 02847bf..9438302 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -742,6 +742,10 @@ static int lst_test_add_ioctl(lstio_test_args_t *args)
 PAGE_SIZE - sizeof(struct lstcon_test)))
return -EINVAL;
 
+   /* Enforce zero parameter length if there's no parameter */
+   if (!args->lstio_tes_param && args->lstio_tes_param_len)
+   return -EINVAL;
+
LIBCFS_ALLOC(batch_name, args->lstio_tes_bat_nmlen + 1);
if (!batch_name)
return rc;
-- 
2.7.4



[PATCH] staging/lustre/lnetselftest: Fix potential integer overflow

2016-12-06 Thread Oleg Drokin
It looks like if the passed in parameter is not present, but
parameter length is non zero, then sanity checks on the length
are skipped and lstcon_test_add() might then use incorrect
allocation that's prone to integer overflow size.

This patch ensures that parameter len is zero if parameter is
not present.

Reported-by: Dan Carpenter 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/selftest/conctl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c 
b/drivers/staging/lustre/lnet/selftest/conctl.c
index 02847bf..9438302 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -742,6 +742,10 @@ static int lst_test_add_ioctl(lstio_test_args_t *args)
 PAGE_SIZE - sizeof(struct lstcon_test)))
return -EINVAL;
 
+   /* Enforce zero parameter length if there's no parameter */
+   if (!args->lstio_tes_param && args->lstio_tes_param_len)
+   return -EINVAL;
+
LIBCFS_ALLOC(batch_name, args->lstio_tes_bat_nmlen + 1);
if (!batch_name)
return rc;
-- 
2.7.4



[PATCH 5/5] staging/lustre/o2iblnd: Fix misspelled attemps->attempts

2016-12-06 Thread Oleg Drokin
Highlighted by checkpatch:
WARNING: 'attemps' may be misspelled - perhaps 'attempts'?
#20278: FILE: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c:3272:
+ * reconnection attemps.

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index bea408d..c7917ab 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3269,7 +3269,7 @@ kiblnd_disconnect_conn(struct kib_conn *conn)
 #define KIB_RECONN_HIGH_RACE   10
 /**
  * Allow connd to take a break and handle other things after consecutive
- * reconnection attemps.
+ * reconnection attempts.
  */
 #define KIB_RECONN_BREAK   100
 
-- 
2.7.4



[PATCH 3/5] staging/lustre: Convert all bare unsigned to unsigned int

2016-12-06 Thread Oleg Drokin
Highlighted by relatively new checkpatch test, warnings like:
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/include/linux/lnet/lnetst.h |  6 +-
 .../staging/lustre/lustre/include/lprocfs_status.h |  3 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |  8 +-
 drivers/staging/lustre/lustre/llite/llite_nfs.c|  2 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  4 +-
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  6 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c   |  3 +-
 .../lustre/lustre/obdclass/lprocfs_status.c|  3 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |  6 +-
 .../staging/lustre/lustre/obdclass/obd_config.c|  4 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   |  2 +-
 drivers/staging/lustre/lustre/osc/osc_quota.c  |  4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c|  6 +-
 drivers/staging/lustre/lustre/ptlrpc/connection.c  |  4 +-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c|  4 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c |  6 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c| 92 +++---
 17 files changed, 83 insertions(+), 80 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lnetst.h 
b/drivers/staging/lustre/include/linux/lnet/lnetst.h
index 78f825d..8a84888 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetst.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetst.h
@@ -244,7 +244,7 @@ typedef struct {
int  lstio_ses_timeout; /* IN: session timeout */
int  lstio_ses_force;   /* IN: force create ? */
/** IN: session features */
-   unsigned lstio_ses_feats;
+   unsigned int lstio_ses_feats;
lst_sid_t __user *lstio_ses_idp;/* OUT: session id */
int  lstio_ses_nmlen;   /* IN: name length */
char __user  *lstio_ses_namep;  /* IN: session name */
@@ -255,7 +255,7 @@ typedef struct {
lst_sid_t __user*lstio_ses_idp; /* OUT: session id */
int __user  *lstio_ses_keyp;/* OUT: local key */
/** OUT: session features */
-   unsigned __user *lstio_ses_featp;
+   unsigned int __user *lstio_ses_featp;
lstcon_ndlist_ent_t __user *lstio_ses_ndinfo;   /* OUT: */
int  lstio_ses_nmlen;   /* IN: name length */
char __user *lstio_ses_namep;   /* OUT: session name */
@@ -328,7 +328,7 @@ typedef struct {
char __user *lstio_grp_namep;   /* IN: group name */
int  lstio_grp_count;   /* IN: # of nodes */
/** OUT: session features */
-   unsigned __user *lstio_grp_featp;
+   unsigned int __user *lstio_grp_featp;
lnet_process_id_t __user *lstio_grp_idsp;   /* IN: nodes */
struct list_head __user *lstio_grp_resultp; /* OUT: list head of
result buffer */
diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
b/drivers/staging/lustre/lustre/include/lprocfs_status.h
index adef2d2..62753da 100644
--- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
+++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
@@ -542,7 +542,8 @@ lprocfs_alloc_stats(unsigned int num, enum 
lprocfs_stats_flags flags);
 void lprocfs_clear_stats(struct lprocfs_stats *stats);
 void lprocfs_free_stats(struct lprocfs_stats **stats);
 void lprocfs_counter_init(struct lprocfs_stats *stats, int index,
- unsigned conf, const char *name, const char *units);
+ unsigned int conf, const char *name,
+ const char *units);
 struct obd_export;
 int lprocfs_exp_cleanup(struct obd_export *exp);
 struct dentry *ldebugfs_add_simple(struct dentry *root,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index 1095331..b22f5ba 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -445,8 +445,8 @@ static struct ldlm_resource *ldlm_resource_getref(struct 
ldlm_resource *res)
return res;
 }
 
-static unsigned ldlm_res_hop_hash(struct cfs_hash *hs,
- const void *key, unsigned mask)
+static unsigned int ldlm_res_hop_hash(struct cfs_hash *hs,
+ const void *key, unsigned int mask)
 {
const struct ldlm_res_id *id  = key;
unsigned intval = 0;
@@ -457,8 +457,8 @@ static unsigned ldlm_res_hop_hash(struct cfs_hash *hs,
return val & mask;
 }
 
-static unsigned ldlm_res_hop_fid_hash(struct cfs_hash *hs,
- const void *key, uns

[PATCH 5/5] staging/lustre/o2iblnd: Fix misspelled attemps->attempts

2016-12-06 Thread Oleg Drokin
Highlighted by checkpatch:
WARNING: 'attemps' may be misspelled - perhaps 'attempts'?
#20278: FILE: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c:3272:
+ * reconnection attemps.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index bea408d..c7917ab 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3269,7 +3269,7 @@ kiblnd_disconnect_conn(struct kib_conn *conn)
 #define KIB_RECONN_HIGH_RACE   10
 /**
  * Allow connd to take a break and handle other things after consecutive
- * reconnection attemps.
+ * reconnection attempts.
  */
 #define KIB_RECONN_BREAK   100
 
-- 
2.7.4



[PATCH 3/5] staging/lustre: Convert all bare unsigned to unsigned int

2016-12-06 Thread Oleg Drokin
Highlighted by relatively new checkpatch test, warnings like:
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/include/linux/lnet/lnetst.h |  6 +-
 .../staging/lustre/lustre/include/lprocfs_status.h |  3 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |  8 +-
 drivers/staging/lustre/lustre/llite/llite_nfs.c|  2 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  4 +-
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  6 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c   |  3 +-
 .../lustre/lustre/obdclass/lprocfs_status.c|  3 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |  6 +-
 .../staging/lustre/lustre/obdclass/obd_config.c|  4 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   |  2 +-
 drivers/staging/lustre/lustre/osc/osc_quota.c  |  4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c|  6 +-
 drivers/staging/lustre/lustre/ptlrpc/connection.c  |  4 +-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c|  4 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c |  6 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c| 92 +++---
 17 files changed, 83 insertions(+), 80 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lnetst.h 
b/drivers/staging/lustre/include/linux/lnet/lnetst.h
index 78f825d..8a84888 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetst.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetst.h
@@ -244,7 +244,7 @@ typedef struct {
int  lstio_ses_timeout; /* IN: session timeout */
int  lstio_ses_force;   /* IN: force create ? */
/** IN: session features */
-   unsigned lstio_ses_feats;
+   unsigned int lstio_ses_feats;
lst_sid_t __user *lstio_ses_idp;/* OUT: session id */
int  lstio_ses_nmlen;   /* IN: name length */
char __user  *lstio_ses_namep;  /* IN: session name */
@@ -255,7 +255,7 @@ typedef struct {
lst_sid_t __user*lstio_ses_idp; /* OUT: session id */
int __user  *lstio_ses_keyp;/* OUT: local key */
/** OUT: session features */
-   unsigned __user *lstio_ses_featp;
+   unsigned int __user *lstio_ses_featp;
lstcon_ndlist_ent_t __user *lstio_ses_ndinfo;   /* OUT: */
int  lstio_ses_nmlen;   /* IN: name length */
char __user *lstio_ses_namep;   /* OUT: session name */
@@ -328,7 +328,7 @@ typedef struct {
char __user *lstio_grp_namep;   /* IN: group name */
int  lstio_grp_count;   /* IN: # of nodes */
/** OUT: session features */
-   unsigned __user *lstio_grp_featp;
+   unsigned int __user *lstio_grp_featp;
lnet_process_id_t __user *lstio_grp_idsp;   /* IN: nodes */
struct list_head __user *lstio_grp_resultp; /* OUT: list head of
result buffer */
diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
b/drivers/staging/lustre/lustre/include/lprocfs_status.h
index adef2d2..62753da 100644
--- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
+++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
@@ -542,7 +542,8 @@ lprocfs_alloc_stats(unsigned int num, enum 
lprocfs_stats_flags flags);
 void lprocfs_clear_stats(struct lprocfs_stats *stats);
 void lprocfs_free_stats(struct lprocfs_stats **stats);
 void lprocfs_counter_init(struct lprocfs_stats *stats, int index,
- unsigned conf, const char *name, const char *units);
+ unsigned int conf, const char *name,
+ const char *units);
 struct obd_export;
 int lprocfs_exp_cleanup(struct obd_export *exp);
 struct dentry *ldebugfs_add_simple(struct dentry *root,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index 1095331..b22f5ba 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -445,8 +445,8 @@ static struct ldlm_resource *ldlm_resource_getref(struct 
ldlm_resource *res)
return res;
 }
 
-static unsigned ldlm_res_hop_hash(struct cfs_hash *hs,
- const void *key, unsigned mask)
+static unsigned int ldlm_res_hop_hash(struct cfs_hash *hs,
+ const void *key, unsigned int mask)
 {
const struct ldlm_res_id *id  = key;
unsigned intval = 0;
@@ -457,8 +457,8 @@ static unsigned ldlm_res_hop_hash(struct cfs_hash *hs,
return val & mask;
 }
 
-static unsigned ldlm_res_hop_fid_hash(struct cfs_hash *hs,
- const void *key, unsigned mask)
+static unsigned

[PATCH 2/5] staging/lustre/socklnd: Fix whitespace problem

2016-12-06 Thread Oleg Drokin
checkpatch highlighted there are 8 spaces that could be converted to a tab:
ERROR: code indent should use tabs where possible
+^I^I^I^I^I */$

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h 
b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
index 2978014..842c453 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
@@ -334,7 +334,7 @@ struct ksock_conn {
atomic_t   ksnc_conn_refcount;/* conn refcount */
atomic_t   ksnc_sock_refcount;/* sock refcount */
struct ksock_sched *ksnc_scheduler; /* who schedules this connection
-*/
+*/
__u32  ksnc_myipaddr; /* my IP */
__u32  ksnc_ipaddr;   /* peer's IP */
intksnc_port; /* peer's port */
-- 
2.7.4



[PATCH 0/5] Lustre style fixes

2016-12-06 Thread Oleg Drokin
These patches fix some more of the low hanging fruits
in the style problems highlighted by checkpatch.
Now only false positive ERRORs are left.
This also converts all bare unsigneds into unsigned ints
and a couple of spelling fixes.

Please consider.

Oleg Drokin (5):
  staging/lustre/o2iblnd: Add missing space
  staging/lustre/socklnd: Fix whitespace problem
  staging/lustre: Convert all bare unsigned to unsigned int
  staging/lustre/o2iblnd: Fix misspelling intialized->intialized
  staging/lustre/o2iblnd: Fix misspelled attemps->attempts

 drivers/staging/lustre/include/linux/lnet/lnetst.h |  6 +-
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c|  4 +-
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c |  4 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd.h|  2 +-
 .../staging/lustre/lustre/include/lprocfs_status.h |  3 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |  8 +-
 drivers/staging/lustre/lustre/llite/llite_nfs.c|  2 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  4 +-
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  6 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c   |  3 +-
 .../lustre/lustre/obdclass/lprocfs_status.c|  3 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |  6 +-
 .../staging/lustre/lustre/obdclass/obd_config.c|  4 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   |  2 +-
 drivers/staging/lustre/lustre/osc/osc_quota.c  |  4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c|  6 +-
 drivers/staging/lustre/lustre/ptlrpc/connection.c  |  4 +-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c|  4 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c |  6 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c| 92 +++---
 20 files changed, 88 insertions(+), 85 deletions(-)

-- 
2.7.4



[PATCH 4/5] staging/lustre/o2iblnd: Fix misspelling intialized->intialized

2016-12-06 Thread Oleg Drokin
Highlighted by checkpatch:
+   if (!ps->ps_net) /* intialized? */

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index e2fc65f..7f761b3 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -1489,7 +1489,7 @@ static int kiblnd_create_fmr_pool(struct kib_fmr_poolset 
*fps,
 static void kiblnd_fail_fmr_poolset(struct kib_fmr_poolset *fps,
struct list_head *zombies)
 {
-   if (!fps->fps_net) /* intialized? */
+   if (!fps->fps_net) /* initialized? */
return;
 
spin_lock(>fps_lock);
@@ -1812,7 +1812,7 @@ static void kiblnd_destroy_pool_list(struct list_head 
*head)
 
 static void kiblnd_fail_poolset(struct kib_poolset *ps, struct list_head 
*zombies)
 {
-   if (!ps->ps_net) /* intialized? */
+   if (!ps->ps_net) /* initialized? */
return;
 
spin_lock(>ps_lock);
-- 
2.7.4



[PATCH 2/5] staging/lustre/socklnd: Fix whitespace problem

2016-12-06 Thread Oleg Drokin
checkpatch highlighted there are 8 spaces that could be converted to a tab:
ERROR: code indent should use tabs where possible
+^I^I^I^I^I */$

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h 
b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
index 2978014..842c453 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
@@ -334,7 +334,7 @@ struct ksock_conn {
atomic_t   ksnc_conn_refcount;/* conn refcount */
atomic_t   ksnc_sock_refcount;/* sock refcount */
struct ksock_sched *ksnc_scheduler; /* who schedules this connection
-*/
+*/
__u32  ksnc_myipaddr; /* my IP */
__u32  ksnc_ipaddr;   /* peer's IP */
intksnc_port; /* peer's port */
-- 
2.7.4



[PATCH 0/5] Lustre style fixes

2016-12-06 Thread Oleg Drokin
These patches fix some more of the low hanging fruits
in the style problems highlighted by checkpatch.
Now only false positive ERRORs are left.
This also converts all bare unsigneds into unsigned ints
and a couple of spelling fixes.

Please consider.

Oleg Drokin (5):
  staging/lustre/o2iblnd: Add missing space
  staging/lustre/socklnd: Fix whitespace problem
  staging/lustre: Convert all bare unsigned to unsigned int
  staging/lustre/o2iblnd: Fix misspelling intialized->intialized
  staging/lustre/o2iblnd: Fix misspelled attemps->attempts

 drivers/staging/lustre/include/linux/lnet/lnetst.h |  6 +-
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c|  4 +-
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c |  4 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd.h|  2 +-
 .../staging/lustre/lustre/include/lprocfs_status.h |  3 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |  8 +-
 drivers/staging/lustre/lustre/llite/llite_nfs.c|  2 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  4 +-
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  6 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c   |  3 +-
 .../lustre/lustre/obdclass/lprocfs_status.c|  3 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |  6 +-
 .../staging/lustre/lustre/obdclass/obd_config.c|  4 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   |  2 +-
 drivers/staging/lustre/lustre/osc/osc_quota.c  |  4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c|  6 +-
 drivers/staging/lustre/lustre/ptlrpc/connection.c  |  4 +-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c|  4 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c |  6 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c| 92 +++---
 20 files changed, 88 insertions(+), 85 deletions(-)

-- 
2.7.4



[PATCH 4/5] staging/lustre/o2iblnd: Fix misspelling intialized->intialized

2016-12-06 Thread Oleg Drokin
Highlighted by checkpatch:
+   if (!ps->ps_net) /* intialized? */

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index e2fc65f..7f761b3 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -1489,7 +1489,7 @@ static int kiblnd_create_fmr_pool(struct kib_fmr_poolset 
*fps,
 static void kiblnd_fail_fmr_poolset(struct kib_fmr_poolset *fps,
struct list_head *zombies)
 {
-   if (!fps->fps_net) /* intialized? */
+   if (!fps->fps_net) /* initialized? */
return;
 
spin_lock(>fps_lock);
@@ -1812,7 +1812,7 @@ static void kiblnd_destroy_pool_list(struct list_head 
*head)
 
 static void kiblnd_fail_poolset(struct kib_poolset *ps, struct list_head 
*zombies)
 {
-   if (!ps->ps_net) /* intialized? */
+   if (!ps->ps_net) /* initialized? */
return;
 
spin_lock(>ps_lock);
-- 
2.7.4



[PATCH 1/5] staging/lustre/o2iblnd: Add missing space

2016-12-06 Thread Oleg Drokin
checkpatch highlighted missing space before assignment
for lock variable.

+   spinlock_t *lock= _data.kib_connd_lock;

Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 92692a2..bea408d 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3276,7 +3276,7 @@ kiblnd_disconnect_conn(struct kib_conn *conn)
 int
 kiblnd_connd(void *arg)
 {
-   spinlock_t *lock= _data.kib_connd_lock;
+   spinlock_t *lock = _data.kib_connd_lock;
wait_queue_t wait;
unsigned long flags;
struct kib_conn *conn;
-- 
2.7.4



[PATCH 1/5] staging/lustre/o2iblnd: Add missing space

2016-12-06 Thread Oleg Drokin
checkpatch highlighted missing space before assignment
for lock variable.

+   spinlock_t *lock= _data.kib_connd_lock;

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 92692a2..bea408d 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3276,7 +3276,7 @@ kiblnd_disconnect_conn(struct kib_conn *conn)
 int
 kiblnd_connd(void *arg)
 {
-   spinlock_t *lock= _data.kib_connd_lock;
+   spinlock_t *lock = _data.kib_connd_lock;
wait_queue_t wait;
unsigned long flags;
struct kib_conn *conn;
-- 
2.7.4



[PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-06 Thread Oleg Drokin
I have been having a lot of unexplainable crashes in osc_lru_shrink
lately that I could not see a good explanation for and then I found
this patch that slip under the radar somehow that incorrectly
converted while loop for lru list iteration into
list_for_each_entry_safe totally ignoring that in the body of
the loop we drop spinlocks guarding this list and move list entries
around.
Not sure why it was not showing up right away, perhaps some of the
more recent LRU changes committed caused some extra pressure on this
code that finally highlighted the breakage.

Reverts: 8adddc36b1fc ("staging: lustre: osc: Use list_for_each_entry_safe")
CC: Bhaktipriya Shridhar <bhaktipriy...@gmail.com>
Signed-off-by: Oleg Drokin <gr...@linuxhacker.ru>
---
I also do not see this patch in any of the mailing lists I am subscribed to.
I wonder if there's a way to subscribe to those Greg's
"This is a note to let you know that I've just added the patch "
emails that concern Lustre to get them even if I am not on the CC list in
the patch itself?

 drivers/staging/lustre/lustre/osc/osc_page.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c 
b/drivers/staging/lustre/lustre/osc/osc_page.c
index c5129d1..e356e4a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -537,7 +537,6 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
struct cl_object *clobj = NULL;
struct cl_page **pvec;
struct osc_page *opg;
-   struct osc_page *temp;
int maxscan = 0;
long count = 0;
int index = 0;
@@ -568,7 +567,7 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
if (force)
cli->cl_lru_reclaim++;
maxscan = min(target << 1, atomic_long_read(>cl_lru_in_list));
-   list_for_each_entry_safe(opg, temp, >cl_lru_list, ops_lru) {
+   while (!list_empty(>cl_lru_list)) {
struct cl_page *page;
bool will_free = false;
 
@@ -578,6 +577,8 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
if (--maxscan < 0)
break;
 
+   opg = list_entry(cli->cl_lru_list.next, struct osc_page,
+ops_lru);
page = opg->ops_cl.cpl_page;
if (lru_page_busy(cli, page)) {
list_move_tail(>ops_lru, >cl_lru_list);
-- 
2.7.4



[PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-06 Thread Oleg Drokin
I have been having a lot of unexplainable crashes in osc_lru_shrink
lately that I could not see a good explanation for and then I found
this patch that slip under the radar somehow that incorrectly
converted while loop for lru list iteration into
list_for_each_entry_safe totally ignoring that in the body of
the loop we drop spinlocks guarding this list and move list entries
around.
Not sure why it was not showing up right away, perhaps some of the
more recent LRU changes committed caused some extra pressure on this
code that finally highlighted the breakage.

Reverts: 8adddc36b1fc ("staging: lustre: osc: Use list_for_each_entry_safe")
CC: Bhaktipriya Shridhar 
Signed-off-by: Oleg Drokin 
---
I also do not see this patch in any of the mailing lists I am subscribed to.
I wonder if there's a way to subscribe to those Greg's
"This is a note to let you know that I've just added the patch "
emails that concern Lustre to get them even if I am not on the CC list in
the patch itself?

 drivers/staging/lustre/lustre/osc/osc_page.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c 
b/drivers/staging/lustre/lustre/osc/osc_page.c
index c5129d1..e356e4a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -537,7 +537,6 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
struct cl_object *clobj = NULL;
struct cl_page **pvec;
struct osc_page *opg;
-   struct osc_page *temp;
int maxscan = 0;
long count = 0;
int index = 0;
@@ -568,7 +567,7 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
if (force)
cli->cl_lru_reclaim++;
maxscan = min(target << 1, atomic_long_read(>cl_lru_in_list));
-   list_for_each_entry_safe(opg, temp, >cl_lru_list, ops_lru) {
+   while (!list_empty(>cl_lru_list)) {
struct cl_page *page;
bool will_free = false;
 
@@ -578,6 +577,8 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
if (--maxscan < 0)
break;
 
+   opg = list_entry(cli->cl_lru_list.next, struct osc_page,
+ops_lru);
page = opg->ops_cl.cpl_page;
if (lru_page_busy(cli, page)) {
list_move_tail(>ops_lru, >cl_lru_list);
-- 
2.7.4



Re: [PATCH 04/22] staging: lustre: osc: handle osc eviction correctly

2016-12-05 Thread Oleg Drokin

On Dec 5, 2016, at 3:55 PM, Dan Carpenter wrote:

> On Fri, Dec 02, 2016 at 07:53:11PM -0500, James Simmons wrote:
>> @@ -3183,8 +3182,10 @@ static int discard_cb(const struct lu_env *env, 
>> struct cl_io *io,
>>  /* page is top page. */
>>  info->oti_next_index = osc_index(ops) + 1;
>>  if (cl_page_own(env, io, page) == 0) {
>> -KLASSERT(ergo(page->cp_type == CPT_CACHEABLE,
>> -  !PageDirty(cl_page_vmpage(page;
>> +if (!ergo(page->cp_type == CPT_CACHEABLE,
>> +  !PageDirty(cl_page_vmpage(page
>> +CL_PAGE_DEBUG(D_ERROR, env, page,
>> +  "discard dirty page?\n");
> 
> 
> I don't understand the point of the ergo macro.  There are way too many
> double negatives (some of them hidden for my small brain).  How is that
> simpler than just writing it out:
> 
>   if (page->cp_type == CPT_CACHEABLE &&
>   PageDirty(cl_page_vmpage(page))
>CL_PAGE_DEBUG(D_ERROR, env, page, "discard dirty page?\n");

I guess it makes it sound chic or something?
I am not a huge fan of it either, esp. in a case like this, though
it might be somewhat more convenient in assertions (where this is converted 
from).


Re: [PATCH 04/22] staging: lustre: osc: handle osc eviction correctly

2016-12-05 Thread Oleg Drokin

On Dec 5, 2016, at 3:55 PM, Dan Carpenter wrote:

> On Fri, Dec 02, 2016 at 07:53:11PM -0500, James Simmons wrote:
>> @@ -3183,8 +3182,10 @@ static int discard_cb(const struct lu_env *env, 
>> struct cl_io *io,
>>  /* page is top page. */
>>  info->oti_next_index = osc_index(ops) + 1;
>>  if (cl_page_own(env, io, page) == 0) {
>> -KLASSERT(ergo(page->cp_type == CPT_CACHEABLE,
>> -  !PageDirty(cl_page_vmpage(page;
>> +if (!ergo(page->cp_type == CPT_CACHEABLE,
>> +  !PageDirty(cl_page_vmpage(page
>> +CL_PAGE_DEBUG(D_ERROR, env, page,
>> +  "discard dirty page?\n");
> 
> 
> I don't understand the point of the ergo macro.  There are way too many
> double negatives (some of them hidden for my small brain).  How is that
> simpler than just writing it out:
> 
>   if (page->cp_type == CPT_CACHEABLE &&
>   PageDirty(cl_page_vmpage(page))
>CL_PAGE_DEBUG(D_ERROR, env, page, "discard dirty page?\n");

I guess it makes it sound chic or something?
I am not a huge fan of it either, esp. in a case like this, though
it might be somewhat more convenient in assertions (where this is converted 
from).


Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-05 Thread Oleg Drokin

On Dec 2, 2016, at 12:33 PM, Quentin Lambert wrote:

> lnet_ipif_enumerate was assigning a pointer from kernel space to user
> space. This patch uses copy_to_user to properly do that assignment.

I guess it's a false positive?

While lnet_sock_ioctl()->kernel_sock_unlocked_ioctl() does call into the
f_op->unlocked_ioctl() with a userspace argument, note that we have
set_fs(KERNEL_DS); in there, therefore allowig copy_from_user
and friends to work on kernel data too as if it was userspace.
(I know it's ugly and we need to find a better way of getting this data,
but at least it's not incorrect).

> 
> Signed-off-by: Quentin Lambert 
> ---
> shouldn't we be using ifc_req instead of ifc_buf?
> 
> drivers/staging/lustre/lnet/lnet/lib-socket.c |8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
> 
> --- a/drivers/staging/lustre/lnet/lnet/lib-socket.c
> +++ b/drivers/staging/lustre/lnet/lnet/lib-socket.c
> @@ -181,7 +181,13 @@ lnet_ipif_enumerate(char ***namesp)
>   goto out0;
>   }
> 
> - ifc.ifc_buf = (char *)ifr;
> + rc = copy_to_user(ifc.ifc_buf, (char *)ifr,
> +   nalloc * sizeof(*ifr));
> + if (rc) {
> + rc = -ENOMEM;
> + goto out1;
> + }
> +
>   ifc.ifc_len = nalloc * sizeof(*ifr);
> 
>   rc = lnet_sock_ioctl(SIOCGIFCONF, (unsigned long));
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org



Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-05 Thread Oleg Drokin

On Dec 2, 2016, at 12:33 PM, Quentin Lambert wrote:

> lnet_ipif_enumerate was assigning a pointer from kernel space to user
> space. This patch uses copy_to_user to properly do that assignment.

I guess it's a false positive?

While lnet_sock_ioctl()->kernel_sock_unlocked_ioctl() does call into the
f_op->unlocked_ioctl() with a userspace argument, note that we have
set_fs(KERNEL_DS); in there, therefore allowig copy_from_user
and friends to work on kernel data too as if it was userspace.
(I know it's ugly and we need to find a better way of getting this data,
but at least it's not incorrect).

> 
> Signed-off-by: Quentin Lambert 
> ---
> shouldn't we be using ifc_req instead of ifc_buf?
> 
> drivers/staging/lustre/lnet/lnet/lib-socket.c |8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
> 
> --- a/drivers/staging/lustre/lnet/lnet/lib-socket.c
> +++ b/drivers/staging/lustre/lnet/lnet/lib-socket.c
> @@ -181,7 +181,13 @@ lnet_ipif_enumerate(char ***namesp)
>   goto out0;
>   }
> 
> - ifc.ifc_buf = (char *)ifr;
> + rc = copy_to_user(ifc.ifc_buf, (char *)ifr,
> +   nalloc * sizeof(*ifr));
> + if (rc) {
> + rc = -ENOMEM;
> + goto out1;
> + }
> +
>   ifc.ifc_len = nalloc * sizeof(*ifr);
> 
>   rc = lnet_sock_ioctl(SIOCGIFCONF, (unsigned long));
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org



Re: [PATCH] staging: lustre: Fix function declaration/definition mismatch

2016-12-05 Thread Oleg Drokin

On Dec 4, 2016, at 10:06 PM, <sandeepjain.li...@gmail.com> 
<sandeepjain.li...@gmail.com> wrote:

> From: Sandeep Jain <sandeepjain.li...@gmail.com>
> 
> Fixes following Sparse errors.
> lprocfs_status.c:1568:5: error: symbol 'lprocfs_wr_root_squash'
> redeclared with different type...
> lprocfs_status.c:1632:5: error: symbol 'lprocfs_wr_nosquash_nids'
> redeclared with different type...
> 
> Signed-off-by: Sandeep Jain <sandeepjain.li...@gmail.com>

Acked-by: Oleg Drokin <oleg.dro...@intel.com>

> ---
> drivers/staging/lustre/lustre/include/lprocfs_status.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
> b/drivers/staging/lustre/lustre/include/lprocfs_status.h
> index cc0713e..b5c24ca 100644
> --- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
> +++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
> @@ -701,9 +701,9 @@ static struct lustre_attr lustre_attr_##name = 
> __ATTR(name, mode, show, store)
> extern const struct sysfs_ops lustre_sysfs_ops;
> 
> struct root_squash_info;
> -int lprocfs_wr_root_squash(const char *buffer, unsigned long count,
> +int lprocfs_wr_root_squash(const char __user *buffer, unsigned long count,
>  struct root_squash_info *squash, char *name);
> -int lprocfs_wr_nosquash_nids(const char *buffer, unsigned long count,
> +int lprocfs_wr_nosquash_nids(const char __user *buffer, unsigned long count,
>struct root_squash_info *squash, char *name);
> 
> /* all quota proc functions */
> -- 
> 2.7.4



Re: [PATCH] staging: lustre: Fix function declaration/definition mismatch

2016-12-05 Thread Oleg Drokin

On Dec 4, 2016, at 10:06 PM,  
 wrote:

> From: Sandeep Jain 
> 
> Fixes following Sparse errors.
> lprocfs_status.c:1568:5: error: symbol 'lprocfs_wr_root_squash'
> redeclared with different type...
> lprocfs_status.c:1632:5: error: symbol 'lprocfs_wr_nosquash_nids'
> redeclared with different type...
> 
> Signed-off-by: Sandeep Jain 

Acked-by: Oleg Drokin 

> ---
> drivers/staging/lustre/lustre/include/lprocfs_status.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
> b/drivers/staging/lustre/lustre/include/lprocfs_status.h
> index cc0713e..b5c24ca 100644
> --- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
> +++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
> @@ -701,9 +701,9 @@ static struct lustre_attr lustre_attr_##name = 
> __ATTR(name, mode, show, store)
> extern const struct sysfs_ops lustre_sysfs_ops;
> 
> struct root_squash_info;
> -int lprocfs_wr_root_squash(const char *buffer, unsigned long count,
> +int lprocfs_wr_root_squash(const char __user *buffer, unsigned long count,
>  struct root_squash_info *squash, char *name);
> -int lprocfs_wr_nosquash_nids(const char *buffer, unsigned long count,
> +int lprocfs_wr_nosquash_nids(const char __user *buffer, unsigned long count,
>struct root_squash_info *squash, char *name);
> 
> /* all quota proc functions */
> -- 
> 2.7.4



  1   2   3   4   5   6   7   8   9   10   >