Re: [PATCH 3/3] exec: Transform exec_update_mutex into a rw_semaphore
On 12/4/20 9:10 PM, Linus Torvalds wrote: > On Fri, Dec 4, 2020 at 11:35 AM Eric W. Biederman > wrote: >> >> From a deadlock perspective the change is strictly better than what we >> have today. The readers will no longer block on each other. > > No, agreed, it's better regardless. > >> For the specific case that syzbot reported it is readers who were >> blocking on each other so that specific case if fixed. > > So the thing is, a reader can still block another reader if a writer > comes in between them. Which is why I was thinking that the same > deadlock could happen if somebody does an execve at just the right > point. > >> On the write side of exec_update_lock we have: >> >> cred_guard_mutex -> exec_update_lock >> >> Which means that to get an ABBA deadlock cred_guard_mutex would need to >> be involved > > No, see above: you can get a deadlock with > > - first reader gets exec_update_lock > > - writer on exec_update_lock blocks on first reader (this is exec) > > - second reader of exec_update_lock now blocks on the writer. > > So if that second reader holds something that the first one wants to > get (or is the same thread as the first one), you have a deadlock: the > first reader will never make any progress, will never release the > lock, and the writer will never get it, and the second reader will > forever wait for the writer that is ahead of it in the queue. > > cred_guard_mutex is immaterial, it's not involved in the deadlock. > Yes, the writer holds it, but it's not relevant for anything else. > > And that deadlock looks very much like what syzcaller detected, in > exactly that scenario: > > Chain exists of: > >exec_update_mutex --> sb_writers#4 --> >lock > >Possible unsafe locking scenario: > > CPU0CPU1 > > lock(>lock); > lock(sb_writers#4); > lock(>lock); > lock(>exec_update_mutex); > >*** DEADLOCK *** > > No? The only thing that is missing is that writer that causes the > exec_update_mutex readers to block each other - exactly like they did > when it was a mutex. > > But I may be missing something entirely obvious that keeps this from > happening. > I think you convinced me: >perf_event_open (exec_update_mutex -> ovl_i_mutex) >chown(ovl_i_mutex -> sb_writes) >sendfile (sb_writes -> p->lock) > by reading from a proc file and writing to overlayfs >proc_pid_syscall (p->lock -> exec_update_mutex) We need just 5 Philosophers (A-E): Philosopher A calls proc_pid_syscall, takes p->lock, and falls asleep for a while now. Philosphper B calls sendfile, takes sb_writes, and has to wait on p->lock. Philosopher C calls chown, takes ovl_i_mutes, and has to wait on sb_writes. Philosopher D calls perf_event_open, takes down_read(exec_mutex_lock), and has to wait on ovl_i_mutex. Philosopher E calls exec, and wants to take down_write(exec_mutex_lock), has to wait for Philosopher D. Now Philosopher A wakes up from his sleep, and wants to take down_read(exec_mutex_lock), but due to fairness reasons, he has to wait until Philosopher E gets and releases his write lock again. If Philosopher A is now blocked, we have a deadlock, right? Bernd. > Linus >
Re: [PATCH 3/3] exec: Transform exec_update_mutex into a rw_semaphore
Hi Eric, I think I remembered from a previous discussion about this topic, that it was unclear if the rw_semaphores are working the same in RT-Linux. Will this fix work in RT as well? On 12/3/20 9:12 PM, Eric W. Biederman wrote: > --- a/kernel/kcmp.c > +++ b/kernel/kcmp.c > @@ -70,25 +70,25 @@ get_file_raw_ptr(struct task_struct *task, unsigned int > idx) > return file; > } > > -static void kcmp_unlock(struct mutex *m1, struct mutex *m2) > +static void kcmp_unlock(struct rw_semaphore *l1, struct rw_semaphore *l2) > { > - if (likely(m2 != m1)) > - mutex_unlock(m2); > - mutex_unlock(m1); > + if (likely(l2 != l1)) is this still necessary ? > + up_read(l2); > + up_read(l1); > } > > -static int kcmp_lock(struct mutex *m1, struct mutex *m2) > +static int kcmp_lock(struct rw_semaphore *l1, struct rw_semaphore *l2) > { > int err; > > - if (m2 > m1) > - swap(m1, m2); > + if (l2 > l1) > + swap(l1, l2); and this is probably also no longer necessary? > > - err = mutex_lock_killable(m1); > - if (!err && likely(m1 != m2)) { > - err = mutex_lock_killable_nested(m2, SINGLE_DEPTH_NESTING); > + err = down_read_killable(l1); > + if (!err && likely(l1 != l2)) { and this can now be unconditionally, right? > + err = down_read_killable_nested(l2, SINGLE_DEPTH_NESTING); > if (err) > - mutex_unlock(m1); > + up_read(l1); > } > > return err; > @@ -156,8 +156,8 @@ SYSCALL_DEFINE5(kcmp, pid_t, pid1, pid_t, pid2, int, type, > /* >* One should have enough rights to inspect task details. >*/ > - ret = kcmp_lock(>signal->exec_update_mutex, > - >signal->exec_update_mutex); > + ret = kcmp_lock(>signal->exec_update_lock, > + >signal->exec_update_lock); > if (ret) > goto err; > if (!ptrace_may_access(task1, PTRACE_MODE_READ_REALCREDS) || > @@ -212,8 +212,8 @@ SYSCALL_DEFINE5(kcmp, pid_t, pid1, pid_t, pid2, int, type, > } > > err_unlock: > - kcmp_unlock(>signal->exec_update_mutex, > - >signal->exec_update_mutex); > + kcmp_unlock(>signal->exec_update_lock, > + >signal->exec_update_lock); > err: > put_task_struct(task1); > put_task_struct(task2); Thanks Bernd.
Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
On 4/30/20 6:40 PM, Linus Torvalds wrote: > On Thu, Apr 30, 2020 at 7:29 AM Bernd Edlinger > wrote: >> >> Ah, now I see, that was of course not the intended effect, >> but that is not where the pseudo-deadlock happens at all, >> would returning -RESTARTNOINTR in this function make this >> patch acceptable, it will not have an effect on the test case? > > So that was why I suggested doing it all with a helper function, and > also doing that > > set_thread_flag(TIF_SIGPENDING); > > because without going through the "check-for-signals" code at return > to user space, -ERESTARTNOINTR doesn't actually _do_ any restart. > > However, the more I looked at it, the less I actually liked that hack. > > Part of it is simply because it can cause the exact same problem that > ptrace() does (at least in theory). And even if you don't get the > livelock thing, you can get the "use 100% CPU time" thing, because if > that case ever triggers, and we re-try, it will generally just _keep_ > on triggering (think "execve is waiting for a zombie, nobody is > reaping it"). > > IOW, restarting doesn't really fix the problem, or guarantee any > forward progress. > Right, if it is a real time process it will result in priority-inversion. Correct. If it is a virus checker it will be real time priority and it will be very aggressive ;-) I can feel its aggressiveness already :-) shiver... And this little zombie-process will paralyze it immediately, nice try. You see what I mean? > So I'd have been ok with your "unsafe_exec_flag" if > > (a) it had been done in one place with a helper function. > > (b) it would _only_ trigger for ptrace (and perhaps seccomp). > > but I don't think it works for that write() case. > > That said, I'm not 100% convinced that that write() case really even > needs that cred_guard_mutex (renamed or not). > > Maybe we can introduce a new mutex just against concurrent ptrace > (which is what at least the _comment_ says_ that > security_setprocattr() wants - I didn't check the actual low-level > security code). > > So maybe that proc_pid_attr_write() case could be done some other way > entirely. > > Th emore we go through all this, the more I really think that Oleg's > patch to just delay the waiting for things until after dropping the > mutex in execve() is the way to go. > > Is it a "simple" and small patch? No. But it really addresses the core > issue, without introducing new odd rules or special cases, or making a > lock that doesn't reliably work as a lock. > Hmm. I think I can agree, that this problem deserves to be solved really slowly. Oleg where was your last patch, does it still work or does it need to be re-based? And I almost forgot about Eric, are you still with us? Thanks Bernd.
Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
Hi Linus, On 4/30/20 3:47 PM, Linus Torvalds wrote: > On Thu, Apr 30, 2020 at 6:39 AM Bernd Edlinger > wrote: >> >> Excuse me, but what in my /proc folder there is no attr/something >> is there a procfs equivalent of pthread_attach ? >> >> What exactly is "attr/something" ? > > Anything that uses that proc_pid_attr_write(). > > Which you should have realized, since you wrote the patch that changed > that function to return -EAGAIN. > Ah, now I see, that was of course not the intended effect, but that is not where the pseudo-deadlock happens at all, would returning -RESTARTNOINTR in this function make this patch acceptable, it will not have an effect on the test case? Bernd. > That's > > /proc//attr/{current,exec,fscreate,keycreate,prev,sockcreate} > > and some smack files. > > Your patch definitely made them return -EINVAL if they happen in that > execve() black hole, instead of waiting for the execve() to just > complete and then just work. > > Dropping a lock really is broken. It';s broken even if you then set a > flag saying "I dropped the lock, now you can't use it". > > Linus >
Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
On 4/30/20 4:16 AM, Linus Torvalds wrote: > On Wed, Apr 29, 2020 at 5:00 PM Jann Horn wrote: >> >> Wouldn't you end up livelocked in the scenario that currently deadlocks? > > The test case that we already know is broken, and any fix will have to > change anyway? > The purpose of the test case was only to test the behaviour of my later patch. The test case _must_ be adjusted to the follow-up patch, I have no problem with that. Anybody may change the test case when we know how to fix the API. I did just not anticipate that Eric would only apply 14 of 16 patches = 87.5% of the patch series. Now that causes some tension, but it is not really a problem IMHO. > Let's just say that I don't care in the least. > > But Bernd's patch as-is breaks a test-case that currently *works*, > namely something as simple as > > echo xyz > /proc//attr/something > Excuse me, but what in my /proc folder there is no attr/something is there a procfs equivalent of pthread_attach ? What exactly is "attr/something" ? > and honestly, breaking something that _works_ and may be used in > reality, in orderf to make a known buggy user testcase work? > > Because no, "write()" returning -EAGAIN isn't ok. > write can return -EAGAIN if the file is non-blocking, it is never the case for a disk file, but for a NFS that is not at all clear, depends on a mount option, and once I had a deadlock in one of my test systems after OOM-stress, but I never was able to reproduce, the umount deadlocked, then I was not able to reboot, could be an alpha-particle of course, who knows... Hmmm.. maybe a stupid idea: We could keep the old deadlock-capable API, and add a new _flag_ somewhere to the PTHREAD_ATTACH call, that _enables_ the non-blocking behavior, how about that. Thanks Bernd. > Linus >
Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
On 4/30/20 1:59 AM, Jann Horn wrote: > On Thu, Apr 30, 2020 at 1:22 AM Linus Torvalds > wrote: >> On Wed, Apr 29, 2020 at 3:38 PM Linus Torvalds >> wrote: >>> >>> If you do it properly, with a helper function instead of repeating >>> that fragile nasty thing, maybe it will look better to me. I added the BIG FAT WARNNIG comments as a mitigation for that. Did you like those comments? >> >> Side note: if it has a special helper function for the "get lock, >> repeat if it was invalid", you can do a better job than return >> -EAGAIN. >> >> In particular, you can do this >> >> set_thread_flag(TIF_SIGPENDING); >> return -RESTARTNOINTR; >> >> which will actually restart the system call. So a ptrace() user (or >> somebody doing a "write()" to /proc//attr/xyz, wouldn't even see >> the impossible EAGAIN error. > > Wouldn't you end up livelocked in the scenario that currently deadlocks? Like: > > - tracer attaches to thread A > - thread B goes into execve, blocks on waiting for A's death > - tracer tries to attach to B and hits the -EAGAIN > > If we make the PTRACE_ATTACH call restart, the tracer will just end up > looping without ever resolving the deadlock. If we want to get through > this cleanly with this approach, userspace needs to either > deprioritize the "I want to attach to pid X" and go back into its > eventloop, or to just treat -EAGAIN as a fatal error and give up > trying to attach to that task. > Yes, exactly, the point is the caller is expected to call wait in that scenario, otherwise the -EAGAIN just repeats forever, that is an API change, yes, but something unavoidable, and the patch tries hard to limit it to cases where the live-lock or pseudo-dead-lock is unavoidable anyway. Bernd.
Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
On 4/29/20 9:26 PM, Jann Horn wrote: > On Wed, Apr 29, 2020 at 9:23 PM Bernd Edlinger > wrote: >> On 4/29/20 7:58 PM, Linus Torvalds wrote: >>> On Tue, Apr 28, 2020 at 4:36 PM Jann Horn wrote: >>>> >>>> On Wed, Apr 29, 2020 at 12:14 AM Linus Torvalds >>>> wrote: >>>>> >>>>> - we move check_unsafe_exec() down. As far as I can tell, there's no >>>>> reason it's that early - the flags it sets aren't actually used until >>>>> when we actually do that final set_creds.. >>>> >>>> Right, we should be able to do that stuff quite a bit later than it >>>> happens now. >>> >>> Actually, looking at it, this looks painful for multiple reasons. >>> >>> The LSM_UNSAFE_xyz flags are used by security_bprm_set_creds(), which >>> when I traced it through, happened much earlier than I thought. Making >>> things worse, it's done by prepare_binprm(), which also potentially >>> gets called from random points by the low-level binfmt handlers too. >>> >>> And we also have that odd "fs->in_exec" flag, which is used by thread >>> cloning and io_uring, and I'm not sure what the exact semantics are. >>> >>> I'm _almost_ inclined to say that we should just abort the execve() >>> entirely if somebody tries to attach in the middle. >>> >>> IOW, get rid of the locking, and replace it all just with a sequence >>> count. Make execve() abort if the sequence count has changed between >>> loading the original creds, and having installed the new creds. >>> >>> You can ptrace _over_ an execve, and you can ptrace _after_ an >>> execve(), but trying to attach just as we execve() would just cause >>> the execve() to fail. >>> >>> We could maybe make it conditional on the credentials actually having >>> changed at all (set another flag in bprm_fill_uid()). So it would only >>> fail for the suid exec case. >>> >>> Because honestly, trying to ptrace in the middle of a suid execve() >>> sounds like an attack, not a useful thing. >>> >> >> I think the use case where a program attaches and detaches many >> processes at a high rate, is either an attack or a very aggressive >> virus checker, fixing a bug that prevents an attack is not a good >> idea, but fixing a bug that would otherwise break a virus checker >> would be a good thing. >> >> By the way, all other attempts to fix it look much more dangerous >> than my initially proposed patch, you know the one you hated, but >> it does work and does not look overly complicated either. >> >> What was the reason why that cannot be done this way? > > I'm not sure which patch you're talking about - I assume you don't > mean > <https://lore.kernel.org/lkml/am6pr03mb5170b06f3a2b75efb98d071ae4...@am6pr03mb5170.eurprd03.prod.outlook.com/>? > No, I meant: [PATCH v7 15/16] exec: Fix dead-lock in de_thread with ptrace_attach https://marc.info/?l=linux-kernel=158559277631548=2 and [PATCH v6 16/16] doc: Update documentation of ->exec_*_mutex https://marc.info/?l=linux-kernel=158559277631548=2 I think that was the latest version, but this had several iterations already. Thanks Bernd.
Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
On 4/29/20 7:58 PM, Linus Torvalds wrote: > On Tue, Apr 28, 2020 at 4:36 PM Jann Horn wrote: >> >> On Wed, Apr 29, 2020 at 12:14 AM Linus Torvalds >> wrote: >>> >>> - we move check_unsafe_exec() down. As far as I can tell, there's no >>> reason it's that early - the flags it sets aren't actually used until >>> when we actually do that final set_creds.. >> >> Right, we should be able to do that stuff quite a bit later than it happens >> now. > > Actually, looking at it, this looks painful for multiple reasons. > > The LSM_UNSAFE_xyz flags are used by security_bprm_set_creds(), which > when I traced it through, happened much earlier than I thought. Making > things worse, it's done by prepare_binprm(), which also potentially > gets called from random points by the low-level binfmt handlers too. > > And we also have that odd "fs->in_exec" flag, which is used by thread > cloning and io_uring, and I'm not sure what the exact semantics are. > > I'm _almost_ inclined to say that we should just abort the execve() > entirely if somebody tries to attach in the middle. > > IOW, get rid of the locking, and replace it all just with a sequence > count. Make execve() abort if the sequence count has changed between > loading the original creds, and having installed the new creds. > > You can ptrace _over_ an execve, and you can ptrace _after_ an > execve(), but trying to attach just as we execve() would just cause > the execve() to fail. > > We could maybe make it conditional on the credentials actually having > changed at all (set another flag in bprm_fill_uid()). So it would only > fail for the suid exec case. > > Because honestly, trying to ptrace in the middle of a suid execve() > sounds like an attack, not a useful thing. > I think the use case where a program attaches and detaches many processes at a high rate, is either an attack or a very aggressive virus checker, fixing a bug that prevents an attack is not a good idea, but fixing a bug that would otherwise break a virus checker would be a good thing. By the way, all other attempts to fix it look much more dangerous than my initially proposed patch, you know the one you hated, but it does work and does not look overly complicated either. What was the reason why that cannot be done this way? Thanks, Bernd.
Re: [PATCHv5] random: Make /dev/random wait for input_pool initializedy
On 2/22/19 12:18 AM, Theodore Y. Ts'o wrote: > The whole premise of reading from /dev/random is that it should only > allow reads up to available estimated entropy. I'm assuming here that > sane users of /dev/random will be reading in chunks of at least 128 > bits, reading smaller amounts for, say, a 32-bit SPECK key, is not > someone who is paranoid enough to want to use /dev/random is likely to > want to do. So what I was doing was to simply prevent any reads from > /dev/random until it had accumulated 128 bits worth of entropy. If > the user is reading 128 bits in order to generate a 128-bit session > key, this won't actually slow down /dev/random any more that it does > today. > > It will slow down someone who just wants to read a byte from > /dev/random immediately after boot --- but as far as I'm concerned, > that's crazy, so I don't really care about optimizing it. Your > suggestion of simply not allowing any reads until the CRNG is > initialized, and then injecting 128-bits into the blocking pool would > also work, but it wouldn't speed up the use case of "the user is > trying to read 128 bits from /dev/random". It only speeds up "read 1 > byte from /dev/random". > > Personally, I would generally be simply tell users, "use getrandom(2) > and be happy", and I consider /dev/random to be a legacy interface. > It's just that there are some crazy customers who seem to believe that > /dev/random is required for FIPS compliance. > Sure. > So optimizing for users who want to read vast amount of data from > /dev/random is a non-goal as far as I am concerned. In particular, > seeding the CRNG and keeping it properly reseeded is higher priority > as far as I'm concerned. If that slows down /dev/random a bit, > /dev/random is *always* going to be slow. > There are rumors that reading one byte from /dev/random in the rcS script makes /dev/urandom properly seeded. That sounds logical, but does not work as expected due to a bug that I am trying to fix. >>> - struct entropy_store *other = _pool; >>> - >>> - if (other->entropy_count <= >>> - 3 * other->poolinfo->poolfracbits / 4) { >>> - schedule_work(>push_work); >>> - r->entropy_total = 0; >>> - } >>> - } >>> + if (!work_pending(>push_work) && >>> + (ENTROPY_BITS(r) > 6 * r->poolinfo->poolbytes) && >>> + (ENTROPY_BITS(other) <= 6 * other->poolinfo->poolbytes)) >>> + schedule_work(>push_work); >> >> push_to_pool will transfer chunks of random_read_wakeup_bits. >> >> I think push_to_pool should also match this change. > > I was trying to keep the size of this patch as small as practical, > since the primary goal was to improve the security of the bits > returned when reading the a 128 bit of randomness immediately after > boot. > I definitely share this desire with you. But there are also issues with the CRNG initialization, it is very important to me not to make those worse. >> I like it that this path is controllable via random_write_wakeup_bits, >> that would be lost with this change. > > Very few people actually use these knobs, and in fact I regret making > them available, since changing these to insane values can impact the > security properties of /dev/random. I don't actually see a good > reason why a user *would* want to adjust the behavior of this code > path, and it makes it much simpler to reason about how this code path > works if we don't make it controllable by the user. > >>> @@ -1553,6 +1548,11 @@ static ssize_t extract_entropy_user(struct >>> entropy_store *r, void __user *buf, >>> int large_request = (nbytes > 256); >>> >>> trace_extract_entropy_user(r->name, nbytes, ENTROPY_BITS(r), _RET_IP_); >>> + if (!r->initialized && r->pull) { >>> + xfer_secondary_pool(r, ENTROPY_BITS(r->pull)/8); >>> + if (!r->initialized) >>> + return 0; >> >> Did you want to do that in push_to_pool? > > No, this was deliberate. The point here is that if the blocking pool > is not initialized (which is defined as having accumulated 128 bits of > entropy once), we refuse to return any entropy at all. > Somehow the entropy is not returned, but still transferred from the input_pool to the blocking_pool, which prevents the initialization of the CRNG. And finally the blocking_pool is initialized, but not the CRNG. See my tests below. >> The second part of the _random_read does not match this change: >> >> wait_event_interruptible(random_read_wait, >> ENTROPY_BITS(_pool) >= >> random_read_wakeup_bits); >> if (signal_pending(current)) >> return -ERESTARTSYS; >> >> >> and will go into a busy loop, when >> ENTROPY_BITS(_pool) >= random_read_wakeup_bits, right? > > No, because what's being tested is the
Re: [PATCHv5] random: Make /dev/random wait for input_pool initializedy
Hi Theodore, On 2/21/19 1:32 AM, Theodore Y. Ts'o wrote: > Hi Brend, > > I've been looking at your patch, and as far as the core part of what > you're patching, I think this patch below is a conceptually simpler > way of fixing the original core problem. Please take a look and let > me know what you think. > > There are some other auxiliary things that your patch is trying to do > where I'm not sure what you're trying to doing and I'm not at all sure > it's correct. Those things should really get separated out as a > separate patches. > Yes, that is true. It just got more each time I looked at the code. Especially the jiffies wrap-around issue is an independent thing. >> Another minor tweak this patch makes, is when the primary >> CRNG is periodically reseeded, we reserve twice the amount >> of read_wakeup_threshold for fairness reasons, to keep >> /dev/random readable when it is not accessed by user mode. > > I'm not sure what you're trying to do/fix here. > It is the select function that is not working. What I try to fix is that you can select on /dev/random for readable, and that it guarantees that at least one process will be able to read at least one byte out of the device, and if that is not done, it should stay in that state. Additionally if the first byte is read out of /dev/random or /dev/random is selected for readable, that should ensure that /dev/urandom is also fully initialized. Additionally I want to achieve that the CRNG is initialized as fast as possible and once that is done, the /dev/random is working as soon as possible. And that the re-seeding of the CRNG is not creating race conditions which invalidate the readability guarantee from the select. Currently on a low entropy system the entroy_available does count from 0 to 128, where CRNG initializes then again from 0 to 128..256, where CRNG reseeds then again from 0 to 128..256, where CRNG reseeds then again... so each time the entropy_available is in the range 0..63 select indicates !ready and then again ready, so it toggles all the time. That is what I wanted to fix mainly. > - Ted > > From 9275727af22bc08958bad45d604038aa8e9b6766 Mon Sep 17 00:00:00 2001 > From: Theodore Ts'o > Date: Wed, 20 Feb 2019 16:06:38 -0500 > Subject: [PATCH] random: only read from /dev/random after its pool has > received 128 bits > > Immediately after boot, we allow reads from /dev/random before its > entropy pool has been fully initialized. Fix this so that we don't > allow this until the blocking pool has received 128 bits. > My observation, with a previous attempt (v3) of my patch is that a system where only interrupt randomness is available it takes typically 2 minutes to accomplish the initial seeding of the CRNG, if from there one has to wait until there is 128 bit entropy in the blocking pool it will take much too long. The patch was actually buggy and did only ensure 80 bits of entropy in the blocking pool but even that did take 6.5 minutes, which felt to me like an absolutely unacceptable waiting time. Therefore I would like to keep that to 2-3 minutes that the CRNG takes for it's initialization. From there I thought, even if the entroy in the input_pool drops to zero, the information content however is still there, and after adding the next 64 bits if raw entropy, it would be fine to extract 8 bytes form the input_pool and feed that to the blocking pool, that would make 6 bytes of output, which should be completely unpredictable, the approach taken in my v5 patch. If that is not good enough, maybe extract 128 bits from the urandom and inject that into the blocking pool without incrementing the blocking_pool's entropy count. > We do this by repurposing the initialized flag in the entropy pool > struct, and use the initialized flag in the blocking pool to indicate > whether it is safe to pull from the blocking pool. > > To do this, we needed to rework when we decide to push entropy from the > input pool to the blocking pool, since the initialized flag for the > input pool was used for this purpose. To simplify things, we no > longer use the initialized flag for that purpose, nor do we use the > entropy_total field any more. > > Signed-off-by: Theodore Ts'o > --- > drivers/char/random.c | 44 +-- > include/trace/events/random.h | 13 --- > 2 files changed, 27 insertions(+), 30 deletions(-) > > diff --git a/drivers/char/random.c b/drivers/char/random.c > index 47ac7cd20fb1..e247c45b2772 100644 > --- a/drivers/char/random.c > +++ b/drivers/char/random.c > @@ -470,7 +470,6 @@ struct entropy_store { > unsigned short add_ptr; > unsigned short input_rotate; > int entropy_count; > - int entropy_total; > unsigned int initialized:1; > unsigned int last_data_init:1; > __u8 last_data[EXTRACT_SIZE]; > @@ -643,7 +642,7 @@ static void process_random_ready_list(void) > */ > static void
[PATCHv5] random: Make /dev/random wait for input_pool initialized
Reading from /dev/random may return data while the getrandom syscall is still blocking. Those bytes are not yet cryptographically secure. The first byte from /dev/random can have as little as 8 bits entropy estimation. Once a read blocks, it will block until /proc/sys/kernel/random/read_wakeup_threshold bits are available, which is usually 64 bits, but can be configured as low as 8 bits. A select will wake up when at least read_wakeup_threshold bits are available. Also when constantly reading bytes out of /dev/random it will prevent the crng init done event forever. Furthermore previously the input_pool got initialized when more than 128 bits of raw entropy are accumulated, but that is only credited 80 bits of pool entroy. Therefore if random_write_wakeup_bits is configured lower than 80 bits, the code path that sends entropy to the blocking_pool can get activated, before the CRNG is initialized and delays the initialization of the CRNG until the blocking_pool is filled to 75%. Fixed by making read and select on /dev/random wait until the input_pool is initialized which means that more than 128 bits of entropy have been fed into the input_pool. This is guaranteed to happen after the CRNG is ready. So after the first byte is readable from /dev/random also /dev/urandom is guaranteed to be fully initialized. Another minor tweak this patch makes, is when the primary CRNG is periodically reseeded, we reserve twice the amount of read_wakeup_threshold for fairness reasons, to keep /dev/random readable when it is not accessed by user mode. And finally, when the system runs for long times the jiffies may roll over, but crng_global_init_time is not updated when reseed every 5 minutes happens. This could cause CRNG reseeding all the time, making the input_pool run low on entropy which would make /dev/random block unexpectedly. Signed-off-by: Bernd Edlinger --- v4 makes the /dev/random block until the input_pool has reached 128 bits of entropy at least once. Now make everything depend on input_pool.initialized. Additionally fixed a potential issue with jiffies roll over in the CRNG reseeding logic, which would cause the cgng_global_init_time to be in the future, causing reseeding to happen all the time. Finally added a fairness reserve to keep the /dev/random in the readable state (select can check non-destructively), when nothing but CRNG reseeding happens. v5 adds one "&& crng_ready()" to fix a race condition. --- drivers/char/random.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index bd449ad..652e024 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -645,6 +645,7 @@ static void process_random_ready_list(void) static void credit_entropy_bits(struct entropy_store *r, int nbits) { int entropy_count, orig; + int entropy_bits; const int pool_size = r->poolinfo->poolfracbits; int nfrac = nbits << ENTROPY_SHIFT; @@ -702,8 +703,9 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) if (cmpxchg(>entropy_count, orig, entropy_count) != orig) goto retry; + entropy_bits = entropy_count >> ENTROPY_SHIFT; r->entropy_total += nbits; - if (!r->initialized && r->entropy_total > 128) { + if (!r->initialized && entropy_bits >= 128) { r->initialized = 1; r->entropy_total = 0; } @@ -713,8 +715,6 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) r->entropy_total, _RET_IP_); if (r == _pool) { - int entropy_bits = entropy_count >> ENTROPY_SHIFT; - if (crng_init < 2 && entropy_bits >= 128) { crng_reseed(_crng, r); entropy_bits = r->entropy_count >> ENTROPY_SHIFT; @@ -722,10 +722,12 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) /* should we wake readers? */ if (entropy_bits >= random_read_wakeup_bits && + r->initialized && wq_has_sleeper(_read_wait)) { wake_up_interruptible(_read_wait); kill_fasync(, SIGIO, POLL_IN); } + /* If the input pool is getting full, send some * entropy to the blocking pool until it is 75% full. */ @@ -917,9 +919,11 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r) } buf; if (r) { - num = extract_entropy(r, , 32, 16, 0); + num = extract_entropy(r, , 32, 16, crng_ready() + ? random_read_wakeup_bits / 4 : 0); if (num == 0)
Re: [PATCHv4] random: Make /dev/random wait for input_pool initialized
> @@ -1826,7 +1830,9 @@ _random_read(int nonblock, char __user *buf, size_t > nbytes) > > nbytes = min_t(size_t, nbytes, SEC_XFER_SIZE); > while (1) { > - n = extract_entropy_user(_pool, buf, nbytes); > + n = input_pool.initialized > + ? extract_entropy_user(_pool, buf, nbytes) Aehm, sorry, now I see this creates a race condition with this code here, since this the crng_reseed here also tries to read from the input_pool, but input_pool.initialized is already true: if (crng_init < 2 && entropy_bits >= 128) { crng_reseed(_crng, r); entropy_bits = r->entropy_count >> ENTROPY_SHIFT; I was able to get a system in this behavior by running 3 instances of #include #include #include int main() { int f = open("/dev/random", O_NDELAY); if (f<0) return 1; for(;;) { unsigned char buf[16]; int x = read(f, buf, sizeof(buf)); if (x>=0) { int i; printf("read %d bytes: ", x); for (i=0; i + n = input_pool.initialized && crng_ready() > + ? extract_entropy_user(_pool, buf, nbytes) Thanks (for your patience :-) Bernd.
[PATCHv4] random: Make /dev/random wait for input_pool initialized
Reading from /dev/random may return data while the getrandom syscall is still blocking. Those bytes are not yet cryptographically secure. The first byte from /dev/random can have as little as 8 bits entropy estimation. Once a read blocks, it will block until /proc/sys/kernel/random/read_wakeup_threshold bits are available, which is usually 64 bits, but can be configured as low as 8 bits. A select will wake up when at least read_wakeup_threshold bits are available. Also when constantly reading bytes out of /dev/random it will prevent the crng init done event forever. Furthermore previously the input_pool got initialized when more than 128 bits of raw entropy are accumulated, but that is only credited 80 bits of pool entroy. Therefore if random_write_wakeup_bits is configured lower than 80 bits, the code path that sends entropy to the blocking_pool can get activated, before the CRNG is initialized and delays the initialization of the CRNG until the blocking_pool is filled to 75%. Fixed by making read and select on /dev/random wait until the input_pool is initialized which means that more than 128 bits of entropy have been fed into the input_pool. This is guaranteed to happen after the CRNG is ready. So after the first byte is readable from /dev/random also /dev/urandom is guaranteed to be fully initialized. Another minor tweak this patch makes, is when the primary CRNG is periodically reseeded, we reserve twice the amount of read_wakeup_threshold for fairness reasons, to keep /dev/random readable when it is not accessed by user mode. And finally, when the system runs for long times the jiffies may roll over, but crng_global_init_time is not updated when reseed every 5 minutes happens. This could cause CRNG reseeding all the time, making the input_pool run low on entropy which would make /dev/random block unexpectedly. Signed-off-by: Bernd Edlinger --- v4 makes the /dev/random block until the input_pool has reached 128 bits of entropy at least once. Now make everything depend on input_pool.initialized. Additionally fixed a potential issue with jiffies roll over in the CRNG reseeding logic, which would cause the cgng_global_init_time to be in the future, causing reseeding to happen all the time. Finally added a fairness reserve to keep the /dev/random in the readable state (select can check non-destructively), when nothing but CRNG reseeding happens. --- drivers/char/random.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index bd449ad..97cde01 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -645,6 +645,7 @@ static void process_random_ready_list(void) static void credit_entropy_bits(struct entropy_store *r, int nbits) { int entropy_count, orig; + int entropy_bits; const int pool_size = r->poolinfo->poolfracbits; int nfrac = nbits << ENTROPY_SHIFT; @@ -702,8 +703,9 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) if (cmpxchg(>entropy_count, orig, entropy_count) != orig) goto retry; + entropy_bits = entropy_count >> ENTROPY_SHIFT; r->entropy_total += nbits; - if (!r->initialized && r->entropy_total > 128) { + if (!r->initialized && entropy_bits >= 128) { r->initialized = 1; r->entropy_total = 0; } @@ -713,8 +715,6 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) r->entropy_total, _RET_IP_); if (r == _pool) { - int entropy_bits = entropy_count >> ENTROPY_SHIFT; - if (crng_init < 2 && entropy_bits >= 128) { crng_reseed(_crng, r); entropy_bits = r->entropy_count >> ENTROPY_SHIFT; @@ -722,10 +722,12 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) /* should we wake readers? */ if (entropy_bits >= random_read_wakeup_bits && + r->initialized && wq_has_sleeper(_read_wait)) { wake_up_interruptible(_read_wait); kill_fasync(, SIGIO, POLL_IN); } + /* If the input pool is getting full, send some * entropy to the blocking pool until it is 75% full. */ @@ -917,9 +919,11 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r) } buf; if (r) { - num = extract_entropy(r, , 32, 16, 0); + num = extract_entropy(r, , 32, 16, crng_ready() + ? random_read_wakeup_bits / 4 : 0); if (num == 0) return; + crng_global_init_time = jiffies; } else {
Re: [PATCHv3] random: Make /dev/random wait for crng_ready
On 2/17/19 9:44 AM, Bernd Edlinger wrote: > > + if (crng_ready() && !blocking_pool.initialized && After some more debugging I realize that blocking_pool.initialized is true after 128 bits of input entropy, but that is only 80 bits credited, due to the asymptotic 3/4 crediting formula. I see that will also enable the code path below: if (entropy_bits > random_write_wakeup_bits && r->initialized && r->entropy_total >= 2*random_read_wakeup_bits) { struct entropy_store *other = _pool; if (other->entropy_count <= 3 * other->poolinfo->poolfracbits / 4) { schedule_work(>push_work); r->entropy_total = 0; } when random_write_wakeup_bits is below 80, and random_read_wakeup_bits is also smallish. This depletes the input_pool in favor of the blocking pool, while we are actually waiting for the input_pool to reach 128 bits security strength, in order to seed the CRNG. I am testing a new version and will post it later today. Sorry for all the back-and forth. Bernd.
[PATCHv3] random: Make /dev/random wait for crng_ready
Reading from /dev/random may return data while the getrandom syscall is still blocking. Those bytes are not yet cryptographically secure. The first byte from /dev/random can have as little as 8 bits entropy estimation. Once a read blocks, it will block until /proc/sys/kernel/random/read_wakeup_threshold bits are available, which is usually 64 bits, but can be configured as low as 8 bits. A select will wake up when at least read_wakeup_threshold bits are available. Also when constantly reading bytes out of /dev/random it will prevent the crng init done event forever. Fixed by making read and select on /dev/random wait until the crng is fully initialized and the blocking_pool is also initialized which means that more than 128 bits of entopy have been accumulated in the blocking_pool. Signed-off-by: Bernd Edlinger --- The v3 version waits much longer than the v2 version, since first 128 bits from the input_pool go into the CRNG, the next 64 bits are only accounted 3/4 = 48 bits in the blocking_pool, so we need in total 192 bits from the input_pool until the blocking_pool is initialized And another 64 bits until a select on /dev/random wakes up. On a system with only interrupt_randomness, this takes 128+192+64=384 random bits, about 6.5 minutes from boot until /dev/random is readable. Maybe this is taking too long, after the CRNG ready? After the input_pool had already been initialized, I wonder if feeding the next 64 bits from the input pool to the empty blocking_pool could already be considered to be good enough to derive the first random byte from the blocking_pool? Thanks Bernd. --- drivers/char/random.c | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index 38c6d1a..666102d 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -476,6 +476,7 @@ struct entropy_store { __u8 last_data[EXTRACT_SIZE]; }; +static void _xfer_secondary_pool(struct entropy_store *r, size_t nbytes); static ssize_t extract_entropy(struct entropy_store *r, void *buf, size_t nbytes, int min, int rsvd); static ssize_t _extract_entropy(struct entropy_store *r, void *buf, @@ -719,8 +720,16 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) entropy_bits = r->entropy_count >> ENTROPY_SHIFT; } + if (crng_ready() && !blocking_pool.initialized && + entropy_bits >= random_read_wakeup_bits) { + _xfer_secondary_pool(_pool, entropy_bits / 8); + r->entropy_total = 0; + entropy_bits = r->entropy_count >> ENTROPY_SHIFT; + } + /* should we wake readers? */ - if (entropy_bits >= random_read_wakeup_bits && + if (blocking_pool.initialized && + entropy_bits >= random_read_wakeup_bits && wq_has_sleeper(_read_wait)) { wake_up_interruptible(_read_wait); kill_fasync(, SIGIO, POLL_IN); @@ -1317,7 +1326,6 @@ void add_disk_randomness(struct gendisk *disk) * from the primary pool to the secondary extraction pool. We make * sure we pull enough for a 'catastrophic reseed'. */ -static void _xfer_secondary_pool(struct entropy_store *r, size_t nbytes); static void xfer_secondary_pool(struct entropy_store *r, size_t nbytes) { if (!r->pull || @@ -1661,7 +1669,7 @@ void get_random_bytes(void *buf, int nbytes) */ int wait_for_random_bytes(void) { - if (likely(crng_ready())) + if (crng_ready()) return 0; return wait_event_interruptible(crng_init_wait, crng_ready()); } @@ -1851,7 +1859,9 @@ void rand_initialize_disk(struct gendisk *disk) nbytes = min_t(size_t, nbytes, SEC_XFER_SIZE); while (1) { - n = extract_entropy_user(_pool, buf, nbytes); + n = blocking_pool.initialized + ? extract_entropy_user(_pool, buf, nbytes) + : 0; if (n < 0) return n; trace_random_read(n*8, (nbytes-n)*8, @@ -1865,6 +1875,7 @@ void rand_initialize_disk(struct gendisk *disk) return -EAGAIN; wait_event_interruptible(random_read_wait, + blocking_pool.initialized && ENTROPY_BITS(_pool) >= random_read_wakeup_bits); if (signal_pending(current)) @@ -1909,7 +1920,8 @@ void rand_initialize_disk(struct gendisk *disk) poll_wait(file, _read_wait, wait); poll_wait(file, _write_wait, wait); mask = 0; - if (ENTROPY_BITS(_pool) >= random_read_wakeup_bits) + if (blocking_pool.initialized &&am
Re: [PATCHv2] random: Make /dev/random wait for crng_ready
On 2/16/19 7:23 PM, Theodore Y. Ts'o wrote: > > This really isn't a correct way to fix things; since the blocking_pool > used for /dev/random and the CRNG state are different things, and are > fed by different sources of entropy. > It is interesting that random_poll does only look at input_pool. Could it be that the existing entropy in blocking_pool also matters? My observation is that _random_read extracts always the requested number of bytes from the input_pool by calling extract_entropy_user. By not calling extract_entropy_user it is straight forward to ensure that the entropy in the input_pool can accumulate until the primary_crng is being initialized. > What we should do is to have a separate flag which indicates that the > blocking_pool has been adequately initialized, and set it only when > the entropy count in the blocking pool is at least 128 bits. When get > woken up by the reader lock, we would transfer entropy from the input > pool to the blocking pool, and if the pool is not yet initializedm, > and the entropy count is less than 128 bits, we wait until it is. > What happens for me, is that single byte reads out of /dev/random are able to pull so much entropy out of the input_pool that the input_pool never reaches 128 bit and therefore crng_ready will never be true. Therefore I want to delay the blocking_pool until the CRNG is fully initialized. I hope you agree about that. When that is done, there are two ways how entropy is transferred from the input_pool to the blocking_pool, just to confirm I have not overlooked something: /* If the input pool is getting full, send some * entropy to the blocking pool until it is 75% full. */ if (entropy_bits > random_write_wakeup_bits && r->initialized && r->entropy_total >= 2*random_read_wakeup_bits) { struct entropy_store *other = _pool; if (other->entropy_count <= 3 * other->poolinfo->poolfracbits / 4) { schedule_work(>push_work); r->entropy_total = 0; } } this rarely happens, before crng_ready because it requires more than 128 bits of entropy, at least in my setup. The other path is: _random_read->extract_entropy_user->xfer_secondary_pool which pulls at least to random_read_wakeup_bits / 8 from the input_pool to the blocking_pool, and the same amount of entropy is extracted again from the blocking_pool so the blocking_pool's entropy_count effectively stays at zero, when this path is taken. That you don't like, right? You propose to disable the second path until the first path has pulled 128 bits into the blocking_pool, right? Thanks Bernd.
[PATCHv2] random: Make /dev/random wait for crng_ready
Reading from /dev/random may return data while the getrandom syscall is still blocking. Those bytes are not yet cryptographically secure. The first byte from /dev/random can have as little as 8 bits entropy estimation. Once a read blocks, it will block until /proc/sys/kernel/random/read_wakeup_threshold bits are available, which is usually 64 bits, but can be configured as low as 8 bits. A select will wake up when at least read_wakeup_threshold bits are available. Also when constantly reading bytes out of /dev/random it will prevent the crng init done event forever. Fixed by making read and select on /dev/random wait until the crng is fully initialized. Signed-off-by: Bernd Edlinger --- v2: Fixed one white space in the code. Also added some more details about the problem to the commit message. --- drivers/char/random.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index bd449ad..c8f16f0 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -721,7 +721,7 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) } /* should we wake readers? */ - if (entropy_bits >= random_read_wakeup_bits && + if (crng_ready() && entropy_bits >= random_read_wakeup_bits && wq_has_sleeper(_read_wait)) { wake_up_interruptible(_read_wait); kill_fasync(, SIGIO, POLL_IN); @@ -1826,7 +1826,9 @@ _random_read(int nonblock, char __user *buf, size_t nbytes) nbytes = min_t(size_t, nbytes, SEC_XFER_SIZE); while (1) { - n = extract_entropy_user(_pool, buf, nbytes); + n = crng_ready() + ? extract_entropy_user(_pool, buf, nbytes) + : 0; if (n < 0) return n; trace_random_read(n*8, (nbytes-n)*8, @@ -1840,6 +1842,7 @@ _random_read(int nonblock, char __user *buf, size_t nbytes) return -EAGAIN; wait_event_interruptible(random_read_wait, + crng_ready() && ENTROPY_BITS(_pool) >= random_read_wakeup_bits); if (signal_pending(current)) @@ -1884,7 +1887,8 @@ random_poll(struct file *file, poll_table * wait) poll_wait(file, _read_wait, wait); poll_wait(file, _write_wait, wait); mask = 0; - if (ENTROPY_BITS(_pool) >= random_read_wakeup_bits) + if (crng_ready() && + ENTROPY_BITS(_pool) >= random_read_wakeup_bits) mask |= EPOLLIN | EPOLLRDNORM; if (ENTROPY_BITS(_pool) < random_write_wakeup_bits) mask |= EPOLLOUT | EPOLLWRNORM; -- 2.7.4
Re: [PATCH] random: Make /dev/random wait for crng_ready
On 2/15/19 2:47 AM, Bernd Edlinger wrote: > while (1) { > - n = extract_entropy_user(_pool, buf, nbytes); > + n = crng_ready() > + ? extract_entropy_user(_pool, buf, nbytes) > + : 0; Aehm, the whitespace after ? does not align with :, didn't see that before. I wonder if it would be better style to use if (crng_ready()) n = extract...; else n = 0; Bernd.
[PATCH] random: Make /dev/random wait for crng_ready
Reading from /dev/random may return data while the getrandom syscall is still blocking. Those bytes are not yet cryptographically secure. Make read and select for reading on /dev/random wait until the crng is fully initialized. Signed-off-by: Bernd Edlinger --- drivers/char/random.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index 38c6d1a..ddbc0e2 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -720,7 +720,7 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits) } /* should we wake readers? */ - if (entropy_bits >= random_read_wakeup_bits && + if (crng_ready() && entropy_bits >= random_read_wakeup_bits && wq_has_sleeper(_read_wait)) { wake_up_interruptible(_read_wait); kill_fasync(, SIGIO, POLL_IN); @@ -1851,7 +1851,9 @@ void rand_initialize_disk(struct gendisk *disk) nbytes = min_t(size_t, nbytes, SEC_XFER_SIZE); while (1) { - n = extract_entropy_user(_pool, buf, nbytes); + n = crng_ready() + ? extract_entropy_user(_pool, buf, nbytes) + : 0; if (n < 0) return n; trace_random_read(n*8, (nbytes-n)*8, @@ -1865,6 +1867,7 @@ void rand_initialize_disk(struct gendisk *disk) return -EAGAIN; wait_event_interruptible(random_read_wait, + crng_ready() && ENTROPY_BITS(_pool) >= random_read_wakeup_bits); if (signal_pending(current)) @@ -1909,7 +1912,8 @@ void rand_initialize_disk(struct gendisk *disk) poll_wait(file, _read_wait, wait); poll_wait(file, _write_wait, wait); mask = 0; - if (ENTROPY_BITS(_pool) >= random_read_wakeup_bits) + if (crng_ready() && + ENTROPY_BITS(_pool) >= random_read_wakeup_bits) mask |= EPOLLIN | EPOLLRDNORM; if (ENTROPY_BITS(_pool) < random_write_wakeup_bits) mask |= EPOLLOUT | EPOLLWRNORM; -- 1.9.1
Re: [RFC][PATCH] Update -Wattribute-alias for gcc9
On 1/25/19 1:24 PM, Bernd Edlinger wrote: > On 1/25/19 12:39 PM, Miguel Ojeda wrote: >> On Fri, Jan 25, 2019 at 11:58 AM Arnd Bergmann wrote: >>> >>> On Fri, Jan 25, 2019 at 11:43 AM Laura Abbott wrote: >>>> >>>> Commit bee20031772a ("disable -Wattribute-alias warning for >>>> SYSCALL_DEFINEx()") disabled -Wattribute-alias with gcc8. >>>> gcc9 changed the format of -Wattribute-alias to take a parameter. >>>> This doesn't quite match with the existing disabling mechanism >>>> so update for gcc9 to match with the default (-Wattribute-alias=1). >>>> >>>> Signed-off-by: Laura Abbott >>>> --- >>>> This is RFC because it feels ugly. I went ahead and did the obvious fixup >>>> but it's worth discussing if we're going to end up with an explosion or >>>> if there's a better way to handle this in one macro. >>> >>> Bernd Edlinger has sent a patch to gcc for this: >>> https://gcc.gnu.org/ml/gcc-patches/2019-01/msg01120.html >>> >>> and Miguel Ojeda said he wanted to send a patch for it to the >>> kernel as well, not sure if he wanted to take a different >>> approach there, so adding both to Cc here. >> >> Thanks Arnd (I was working with Martin on the expanded >> -Wmissing-attribute warnings, not on this, but thanks nevertheless :). >> >> Martin/Bernd: from the GCC mailing list I am not sure if we should >> expect the old behavior to be maintained or not. >> > > I believe it is not intentional to break the old syntax of the > pragma. There will be new -Wattribute-alias=1 and -Wattribute-alias=2 > and -Wattribute-alias is easy to retain as an alias for -Wattribute-alias=1. > That is what my patch will do. > Okay, I committed the -Wattribute-alias patch to gcc trunk, now as https://gcc.gnu.org/viewcvs/gcc?view=revision=268336 . So there will be no need for a workaround on your side. Also fixed a few false positive -Waddress-of-packed-member warnings with https://gcc.gnu.org/viewcvs/gcc?view=revision=268118 and https://gcc.gnu.org/viewcvs/gcc?view=revision=268337 . However there remain a lot of warnings from -Waddress-of-packed-member, that look more or less valid, has anybody an idea how to handle these? Bernd.
Re: [RFC][PATCH] Update -Wattribute-alias for gcc9
On 1/25/19 12:39 PM, Miguel Ojeda wrote: > On Fri, Jan 25, 2019 at 11:58 AM Arnd Bergmann wrote: >> >> On Fri, Jan 25, 2019 at 11:43 AM Laura Abbott wrote: >>> >>> Commit bee20031772a ("disable -Wattribute-alias warning for >>> SYSCALL_DEFINEx()") disabled -Wattribute-alias with gcc8. >>> gcc9 changed the format of -Wattribute-alias to take a parameter. >>> This doesn't quite match with the existing disabling mechanism >>> so update for gcc9 to match with the default (-Wattribute-alias=1). >>> >>> Signed-off-by: Laura Abbott >>> --- >>> This is RFC because it feels ugly. I went ahead and did the obvious fixup >>> but it's worth discussing if we're going to end up with an explosion or >>> if there's a better way to handle this in one macro. >> >> Bernd Edlinger has sent a patch to gcc for this: >> https://gcc.gnu.org/ml/gcc-patches/2019-01/msg01120.html >> >> and Miguel Ojeda said he wanted to send a patch for it to the >> kernel as well, not sure if he wanted to take a different >> approach there, so adding both to Cc here. > > Thanks Arnd (I was working with Martin on the expanded > -Wmissing-attribute warnings, not on this, but thanks nevertheless :). > > Martin/Bernd: from the GCC mailing list I am not sure if we should > expect the old behavior to be maintained or not. > I believe it is not intentional to break the old syntax of the pragma. There will be new -Wattribute-alias=1 and -Wattribute-alias=2 and -Wattribute-alias is easy to retain as an alias for -Wattribute-alias=1. That is what my patch will do. Bernd.
[PATCH] rt61pci: Work around a firmware bug with shared keys
Apparently the rt2x61 firmware fails temporarily to decode broadcast packets if the shared keys are not assigned in the "correct" sequence. At the same time unicast packets work fine, since they are encrypted with the pairwise key. At least with WPA2 CCMP mode the shared keys are set in the following sequence: keyidx=1, 2, 1, 2. After a while only keyidx 2 gets decrypted, and keyidx 1 is ignored, probably because there is never a keyidx 3. Symptoms are arping -b works for 10 minutes, since keyidx=2 is used for broadcast, and then it stops working for 10 minutes, because keyidx=1 is used. That failure mode repeats forever. Note, the firmware does not even know which keyidx corresponds to which hw_key_idx so the firmware is trying to be smarter than the driver, which is bound to fail. As workaround the function rt61pci_config_shared_key requests software decryption of the shared keys, by returning EOPNOTSUPP. However, pairwise keys are still handled by hardware which works just fine. Signed-off-by: Bernd Edlinger --- drivers/net/wireless/ralink/rt2x00/rt61pci.c | 93 ++-- 1 file changed, 4 insertions(+), 89 deletions(-) diff --git a/drivers/net/wireless/ralink/rt2x00/rt61pci.c b/drivers/net/wireless/ralink/rt2x00/rt61pci.c index 4c5de8f..52b9fc4 100644 --- a/drivers/net/wireless/ralink/rt2x00/rt61pci.c +++ b/drivers/net/wireless/ralink/rt2x00/rt61pci.c @@ -321,97 +321,12 @@ static int rt61pci_config_shared_key(struct rt2x00_dev *rt2x00dev, struct rt2x00lib_crypto *crypto, struct ieee80211_key_conf *key) { - struct hw_key_entry key_entry; - struct rt2x00_field32 field; - u32 mask; - u32 reg; - - if (crypto->cmd == SET_KEY) { - /* -* rt2x00lib can't determine the correct free -* key_idx for shared keys. We have 1 register -* with key valid bits. The goal is simple, read -* the register, if that is full we have no slots -* left. -* Note that each BSS is allowed to have up to 4 -* shared keys, so put a mask over the allowed -* entries. -*/ - mask = (0xf << crypto->bssidx); - - reg = rt2x00mmio_register_read(rt2x00dev, SEC_CSR0); - reg &= mask; - - if (reg && reg == mask) - return -ENOSPC; - - key->hw_key_idx += reg ? ffz(reg) : 0; - - /* -* Upload key to hardware -*/ - memcpy(key_entry.key, crypto->key, - sizeof(key_entry.key)); - memcpy(key_entry.tx_mic, crypto->tx_mic, - sizeof(key_entry.tx_mic)); - memcpy(key_entry.rx_mic, crypto->rx_mic, - sizeof(key_entry.rx_mic)); - - reg = SHARED_KEY_ENTRY(key->hw_key_idx); - rt2x00mmio_register_multiwrite(rt2x00dev, reg, - _entry, sizeof(key_entry)); - - /* -* The cipher types are stored over 2 registers. -* bssidx 0 and 1 keys are stored in SEC_CSR1 and -* bssidx 1 and 2 keys are stored in SEC_CSR5. -* Using the correct defines correctly will cause overhead, -* so just calculate the correct offset. -*/ - if (key->hw_key_idx < 8) { - field.bit_offset = (3 * key->hw_key_idx); - field.bit_mask = 0x7 << field.bit_offset; - - reg = rt2x00mmio_register_read(rt2x00dev, SEC_CSR1); - rt2x00_set_field32(, field, crypto->cipher); - rt2x00mmio_register_write(rt2x00dev, SEC_CSR1, reg); - } else { - field.bit_offset = (3 * (key->hw_key_idx - 8)); - field.bit_mask = 0x7 << field.bit_offset; - - reg = rt2x00mmio_register_read(rt2x00dev, SEC_CSR5); - rt2x00_set_field32(, field, crypto->cipher); - rt2x00mmio_register_write(rt2x00dev, SEC_CSR5, reg); - } - - /* -* The driver does not support the IV/EIV generation -* in hardware. However it doesn't support the IV/EIV -* inside the ieee80211 frame either, but requires it -* to be provided separately for the descriptor. -* rt2x00lib will cut the IV/EIV data out of all frames -* given to us by mac80211, but we must tell mac80211 -* to generate the IV/EIV data. -*/ - key->flags |= IEEE80211_KEY_FLAG_GENERATE_IV; - } - /* -
[PATCH v4 4/4] rtlwifi: Don't clear num_rx_inperiod too early
This patch moves the clearing of rtlpriv->link_info.num_rx_inperiod in rtl_watchdog_wq_callback a few lines down. This is necessary since it is still used in the "AP off" detection code block. Moved clearing of rtlpriv->link_info.num_rx_inperiod as well for consistency. Signed-off-by: Bernd Edlinger --- v2: Improved patch description. v3: Make the title fit in one line. v4: Try to fix the line endings the message body. --- drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c b/drivers/net/wireless/realtek/rtlwifi/base.c index ef9b502..7aa68fe 100644 --- a/drivers/net/wireless/realtek/rtlwifi/base.c +++ b/drivers/net/wireless/realtek/rtlwifi/base.c @@ -2172,8 +2172,6 @@ void rtl_watchdog_wq_callback(void *data) ; } - rtlpriv->link_info.num_rx_inperiod = 0; - rtlpriv->link_info.num_tx_inperiod = 0; for (tid = 0; tid <= 7; tid++) rtlpriv->link_info.tidtx_inperiod[tid] = 0; @@ -2236,6 +2234,8 @@ void rtl_watchdog_wq_callback(void *data) rtlpriv->btcoexist.btc_info.in_4way = false; } + rtlpriv->link_info.num_rx_inperiod = 0; + rtlpriv->link_info.num_tx_inperiod = 0; rtlpriv->link_info.bcn_rx_inperiod = 0; /* <6> scan list */ -- 1.9.1
[PATCH v4 3/4] rtl8723ae: Re-introduce the adaptive rate control
This re-introduces the function rtl8723e_dm_refresh_rate_adaptive_mask. This function was present in a previous version of the code base, it works just fine for me -- as long as it is not using stale data. Unlike the original version of this function it avoids using dm.undec_sm_pwdb when no beacon was received. Fixed a style nit in rtl8723e_dm_init_rate_adaptive_mask. Signed-off-by: Bernd Edlinger --- v2: Improve patch description. v3: Improve patch description, mention that dm.undec_sm_pwdb is not used when no beacon received. Make the title fit in one line. v4: Try to fix the line endings the message body. --- .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 87 +- 1 file changed, 85 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 902b944..acfd54c 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -673,7 +673,7 @@ void rtl8723e_dm_check_txpower_tracking(struct ieee80211_hw *hw) void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); - struct rate_adaptive *p_ra = &(rtlpriv->ra); + struct rate_adaptive *p_ra = >ra; p_ra->ratr_state = DM_RATR_STA_INIT; p_ra->pre_ratr_state = DM_RATR_STA_INIT; @@ -685,6 +685,89 @@ void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) } +void rtl8723e_dm_refresh_rate_adaptive_mask(struct ieee80211_hw *hw) +{ + struct rtl_priv *rtlpriv = rtl_priv(hw); + struct rtl_hal *rtlhal = rtl_hal(rtl_priv(hw)); + struct rtl_mac *mac = rtl_mac(rtl_priv(hw)); + struct rate_adaptive *p_ra = >ra; + u32 low_rssithresh_for_ra, high_rssithresh_for_ra; + struct ieee80211_sta *sta = NULL; + + if (is_hal_stop(rtlhal)) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver is going to unload\n"); + return; + } + + if (!rtlpriv->dm.useramask) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver does not control rate adaptive mask\n"); + return; + } + + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION) { + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_MIDDLE: + high_rssithresh_for_ra = 55; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_LOW: + high_rssithresh_for_ra = 60; + low_rssithresh_for_ra = 25; + break; + default: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + } + + if (rtlpriv->link_info.bcn_rx_inperiod == 0) + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + default: + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + break; + case DM_RATR_STA_MIDDLE: + case DM_RATR_STA_LOW: + p_ra->ratr_state = DM_RATR_STA_LOW; + break; + } + else if (rtlpriv->dm.undec_sm_pwdb > high_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_HIGH; + else if (rtlpriv->dm.undec_sm_pwdb > low_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + else + p_ra->ratr_state = DM_RATR_STA_LOW; + + if (p_ra->pre_ratr_state != p_ra->ratr_state) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI = %ld\n", +rtlpriv->dm.undec_sm_pwdb); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI_LEVEL = %d\n", p_ra->ratr_state); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"PreState = %d, CurState = %d\n", +p_ra->pre_ratr_state, p_ra->ratr_state); + + rcu_read_lock(); + sta = rtl_find_sta(hw, mac->bssid); + if (sta) + rtlpriv->cfg->ops->update_rate_tbl(hw, sta,
[PATCH v4 2/4] rtl8723ae: Dont use old data for input gain control
When no beacon was received, the value in dm.undec_sm_pwdb is most likely out of date and should not be used to adjust the input path. Assume instead that the signal level is low. Fix the state machine in rtl8723e_dm_cck_packet_detection_thresh which did not clear pre_cck_fa_state when changing cur_cck_pd_state from CCK_PD_STAGE_LOWRSSI/CCK_FA_STAGE_LOW to CCK_PD_STAGE_HIGHRSSI and back again to CCK_PD_STAGE_LOWRSSI/CCK_FA_STAGE_LOW, the register RCCK0_CCA not written to 0x83 on the second change. Explicitly initialize pre_cck_fa_state/cur_cck_fa_state in rtl_dm_diginit. Signed-off-by: Bernd Edlinger --- v2: Improved patch description. v3: Make the title fit in one line. v4: Try to fix the line endings the message body. --- drivers/net/wireless/realtek/rtlwifi/core.c | 2 ++ drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c | 8 2 files changed, 10 insertions(+) diff --git a/drivers/net/wireless/realtek/rtlwifi/core.c b/drivers/net/wireless/realtek/rtlwifi/core.c index 4bf7967..ce23339 100644 --- a/drivers/net/wireless/realtek/rtlwifi/core.c +++ b/drivers/net/wireless/realtek/rtlwifi/core.c @@ -1957,5 +1957,7 @@ void rtl_dm_diginit(struct ieee80211_hw *hw, u32 cur_igvalue) dm_digtable->bt30_cur_igi = 0x32; dm_digtable->pre_cck_pd_state = CCK_PD_STAGE_MAX; dm_digtable->cur_cck_pd_state = CCK_PD_STAGE_LOWRSSI; + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } EXPORT_SYMBOL(rtl_dm_diginit); diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 42a6fba..902b944 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -151,8 +151,14 @@ static u8 rtl8723e_dm_initial_gain_min_pwdb(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); struct dig_t *dm_digtable = >dm_digtable; + struct rtl_mac *mac = rtl_mac(rtlpriv); long rssi_val_min = 0; + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION && + rtlpriv->link_info.bcn_rx_inperiod == 0) + return 0; + if ((dm_digtable->curmultista_cstate == DIG_MULTISTA_CONNECT) && (dm_digtable->cursta_cstate == DIG_STA_CONNECT)) { if (rtlpriv->dm.entry_min_undec_sm_pwdb != 0) @@ -417,6 +423,8 @@ static void rtl8723e_dm_cck_packet_detection_thresh(struct ieee80211_hw *hw) } else { rtl_set_bbreg(hw, RCCK0_CCA, MASKBYTE2, 0xcd); rtl_set_bbreg(hw, RCCK0_SYSTEM, MASKBYTE1, 0x47); + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } dm_digtable->pre_cck_pd_state = dm_digtable->cur_cck_pd_state; -- 1.9.1
[PATCH v4 1/4] rtl8723ae: Take the FW LPS mode handling out
This appears to trigger a firmware bug and causes severe problems with rtl8723ae PCI devices. When the power save mode is activated for longer periods of time the firmware stops to receive any packets. This problem was exposed by commit 873ffe154ae0 ("rtlwifi: Fix logic error in enter/exit power-save mode"). Previously the power save mode was only active rarely and only for a short time so that the problem was not noticeable. Signed-off-by: Bernd Edlinger --- v2: Adjust the defaults of swlps and fwlps module parameters to match the firmware capabilities instead of removing the whole code, so it can be easily re-activated once a firmware update is available. v3: Make the title fit in one line. v4: Try to fix the line endings the message body. --- drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c index 07b82700d1..3103151 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c @@ -266,8 +266,8 @@ static bool is_fw_header(struct rtlwifi_firmware_header *hdr) static struct rtl_mod_params rtl8723e_mod_params = { .sw_crypto = false, .inactiveps = true, - .swctrl_lps = false, - .fwctrl_lps = true, + .swctrl_lps = true, + .fwctrl_lps = false, .aspm_support = 1, .debug_level = 0, .debug_mask = 0, @@ -395,8 +395,8 @@ static bool is_fw_header(struct rtlwifi_firmware_header *hdr) bool, 0444); MODULE_PARM_DESC(swenc, "Set to 1 for software crypto (default 0)\n"); MODULE_PARM_DESC(ips, "Set to 0 to not use link power save (default 1)\n"); -MODULE_PARM_DESC(swlps, "Set to 1 to use SW control power save (default 0)\n"); -MODULE_PARM_DESC(fwlps, "Set to 1 to use FW control power save (default 1)\n"); +MODULE_PARM_DESC(swlps, "Set to 1 to use SW control power save (default 1)\n"); +MODULE_PARM_DESC(fwlps, "Set to 1 to use FW control power save (default 0)\n"); MODULE_PARM_DESC(msi, "Set to 1 to use MSI interrupts mode (default 0)\n"); MODULE_PARM_DESC(aspm, "Set to 1 to enable ASPM (default 1)\n"); MODULE_PARM_DESC(debug_level, "Set debug level (0-5) (default 0)"); -- 1.9.1
[PATCH v4 0/4] rtlwifi: Fix issues with rtl8723ae
Currently the rtl8723ae driver is broken (since v4.7). Connection to AP is lost very often, especially when the signal level is not very good. The main issue is the power save mode is basically not working, and seems to trigger a firmware bug. So I had to take out the FW LPS mode handling. While debugging the driver I found a couple related issues, for instance that the signal level in dm.undec_sm_pwdb is no longer accurate (may be even much too high) when no more packets are received, and it increases the likelihood to receive something if the input gain is set to maximum. The patch was tested with the rtl8723ae PCI card in my laptop against a FRITZ!Box 7590 AP -- the WiFi connection works now very reliable for me. ChangeLog: v2: Adjusts the defaults of swlps and fwlps module parameters to match the firmware capabilities instead of removing the whole code, so it can be easily re-activated once a firmware update is available. v3: Make the title of each patch fit in one line. v4: Try to fix the line endings the message body. Which is an exchange server issue. The patch does not change at all. Bernd Edlinger (4): rtl8723ae: Take the FW LPS mode handling out rtl8723ae: Dont use old data for input gain control rtl8723ae: Re-introduce the adaptive rate control rtlwifi: Don't clear num_rx_inperiod too early drivers/net/wireless/realtek/rtlwifi/base.c| 4 +- drivers/net/wireless/realtek/rtlwifi/core.c| 2 + .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 95 +- .../net/wireless/realtek/rtlwifi/rtl8723ae/sw.c| 8 +- 4 files changed, 101 insertions(+), 8 deletions(-) -- 1.9.1
Re: [PATCH v3 1/4] rtl8723ae: Take the FW LPS mode handling out
On 1/8/19 6:27 PM, Kalle Valo wrote: > Bernd Edlinger writes: > >> This appears to trigger a firmware bug and causes severe >> problems with rtl8723ae PCI devices. >> >> When the power save mode is activated for longer periods >> of time the firmware stops to receive any packets. >> >> This problem was exposed by commit 873ffe154ae0 ("rtlwifi: >> Fix logic error in enter/exit power-save mode"). >> >> Previously the power save mode was only active rarely and >> only for a short time so that the problem was not noticeable. >> >> Signed-off-by: Bernd Edlinger >> --- >> v2: Adjust the defaults of swlps and fwlps module >> parameters to match the firmware capabilities instead of removing >> the whole code, so it can be easily re-activated once a firmware >> update is available. >> >> v3: Make the title fit in one line. > > hotmail.de made me suspicious and indeed my Spidey sense was right: > > Applying: rtl8723ae: Take the FW LPS mode handling out > Using index info to reconstruct a base tree... > .git/rebase-apply/patch:22: trailing whitespace. > .swctrl_lps = true, > .git/rebase-apply/patch:23: trailing whitespace. > .fwctrl_lps = false, > .git/rebase-apply/patch:33: trailing whitespace. > MODULE_PARM_DESC(swlps, "Set to 1 to use SW control power save (default > 1)\n"); > .git/rebase-apply/patch:34: trailing whitespace. > MODULE_PARM_DESC(fwlps, "Set to 1 to use FW control power save (default > 0)\n"); > error: patch failed: drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c:266 > error: drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c: patch does not > apply > error: Did you hand edit your patch? > It does not apply to blobs recorded in its index. > Patch failed at 0001 rtl8723ae: Take the FW LPS mode handling out > > Did you use Outlook to send this patch or what? Anyway, it's strongly > suggested to use git to submit the patches (and not use Outlook servers > either as they are notorious in breaking our patches): > No, I did use firefox with the settings which I found at https://www.kernel.org/doc/html/latest/process/email-clients.html#email-clients I think all that is wrong with the message body is it is base64 encoded, and all the lines end with CRLF, and it seems the CR is the trailing whitespace. I think the Character Encoding/UTF-8 does not work right with Exchange Servers. What Character Encoding would git send-email use? UTF-8? So I will try to send the patches again, with 7-bit encoded message bodies. Actually it will be quoted-printable, but I do not use any umlauts. So for the records, I did open about:config in the firefox preferences, and change: mailnews.send_default_charset from UTF-8 to ISO-8859-1 that I have already done before, but it was not sufficient: mailnews.send_plaintext_flowed from true to false mailnews.wraplength from 72 to 0 Thanks Bernd.
[PATCH v3 4/4] rtlwifi: Don't clear num_rx_inperiod too early
This patch moves the clearing of rtlpriv->link_info.num_rx_inperiod in rtl_watchdog_wq_callback a few lines down. This is necessary since it is still used in the "AP off" detection code block. Moved clearing of rtlpriv->link_info.num_rx_inperiod as well for consistency. Signed-off-by: Bernd Edlinger --- v2: Improved patch description. v3: Make the title fit in one line. --- drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c b/drivers/net/wireless/realtek/rtlwifi/base.c index ef9b502..7aa68fe 100644 --- a/drivers/net/wireless/realtek/rtlwifi/base.c +++ b/drivers/net/wireless/realtek/rtlwifi/base.c @@ -2172,8 +2172,6 @@ void rtl_watchdog_wq_callback(void *data) ; } - rtlpriv->link_info.num_rx_inperiod = 0; - rtlpriv->link_info.num_tx_inperiod = 0; for (tid = 0; tid <= 7; tid++) rtlpriv->link_info.tidtx_inperiod[tid] = 0; @@ -2236,6 +2234,8 @@ void rtl_watchdog_wq_callback(void *data) rtlpriv->btcoexist.btc_info.in_4way = false; } + rtlpriv->link_info.num_rx_inperiod = 0; + rtlpriv->link_info.num_tx_inperiod = 0; rtlpriv->link_info.bcn_rx_inperiod = 0; /* <6> scan list */ -- 1.9.1
[PATCH v3 3/4] rtl8723ae: Re-introduce the adaptive rate control
This re-introduces the function rtl8723e_dm_refresh_rate_adaptive_mask. This function was present in a previous version of the code base, it works just fine for me -- as long as it is not using stale data. Unlike the original version of this function it avoids using dm.undec_sm_pwdb when no beacon was received. Fixed a style nit in rtl8723e_dm_init_rate_adaptive_mask. Signed-off-by: Bernd Edlinger --- v2: Improve patch description. v3: Improve patch description, mention that dm.undec_sm_pwdb is not used when no beacon received. Make the title fit in one line. --- .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 87 +- 1 file changed, 85 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 902b944..acfd54c 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -673,7 +673,7 @@ void rtl8723e_dm_check_txpower_tracking(struct ieee80211_hw *hw) void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); - struct rate_adaptive *p_ra = &(rtlpriv->ra); + struct rate_adaptive *p_ra = >ra; p_ra->ratr_state = DM_RATR_STA_INIT; p_ra->pre_ratr_state = DM_RATR_STA_INIT; @@ -685,6 +685,89 @@ void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) } +void rtl8723e_dm_refresh_rate_adaptive_mask(struct ieee80211_hw *hw) +{ + struct rtl_priv *rtlpriv = rtl_priv(hw); + struct rtl_hal *rtlhal = rtl_hal(rtl_priv(hw)); + struct rtl_mac *mac = rtl_mac(rtl_priv(hw)); + struct rate_adaptive *p_ra = >ra; + u32 low_rssithresh_for_ra, high_rssithresh_for_ra; + struct ieee80211_sta *sta = NULL; + + if (is_hal_stop(rtlhal)) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver is going to unload\n"); + return; + } + + if (!rtlpriv->dm.useramask) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver does not control rate adaptive mask\n"); + return; + } + + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION) { + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_MIDDLE: + high_rssithresh_for_ra = 55; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_LOW: + high_rssithresh_for_ra = 60; + low_rssithresh_for_ra = 25; + break; + default: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + } + + if (rtlpriv->link_info.bcn_rx_inperiod == 0) + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + default: + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + break; + case DM_RATR_STA_MIDDLE: + case DM_RATR_STA_LOW: + p_ra->ratr_state = DM_RATR_STA_LOW; + break; + } + else if (rtlpriv->dm.undec_sm_pwdb > high_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_HIGH; + else if (rtlpriv->dm.undec_sm_pwdb > low_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + else + p_ra->ratr_state = DM_RATR_STA_LOW; + + if (p_ra->pre_ratr_state != p_ra->ratr_state) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI = %ld\n", +rtlpriv->dm.undec_sm_pwdb); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI_LEVEL = %d\n", p_ra->ratr_state); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"PreState = %d, CurState = %d\n", +p_ra->pre_ratr_state, p_ra->ratr_state); + + rcu_read_lock(); + sta = rtl_find_sta(hw, mac->bssid); + if (sta) + rtlpriv->cfg->ops->update_rate_tbl(hw, sta, + p_ra->ratr_st
[PATCH v3 2/4] rtl8723ae: Dont use old data for input gain control
When no beacon was received, the value in dm.undec_sm_pwdb is most likely out of date and should not be used to adjust the input path. Assume instead that the signal level is low. Fix the state machine in rtl8723e_dm_cck_packet_detection_thresh which did not clear pre_cck_fa_state when changing cur_cck_pd_state from CCK_PD_STAGE_LOWRSSI/CCK_FA_STAGE_LOW to CCK_PD_STAGE_HIGHRSSI and back again to CCK_PD_STAGE_LOWRSSI/CCK_FA_STAGE_LOW, the register RCCK0_CCA not written to 0x83 on the second change. Explicitly initialize pre_cck_fa_state/cur_cck_fa_state in rtl_dm_diginit. Signed-off-by: Bernd Edlinger --- v2: Improved patch description. v3: Make the title fit in one line. --- drivers/net/wireless/realtek/rtlwifi/core.c | 2 ++ drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c | 8 2 files changed, 10 insertions(+) diff --git a/drivers/net/wireless/realtek/rtlwifi/core.c b/drivers/net/wireless/realtek/rtlwifi/core.c index 4bf7967..ce23339 100644 --- a/drivers/net/wireless/realtek/rtlwifi/core.c +++ b/drivers/net/wireless/realtek/rtlwifi/core.c @@ -1957,5 +1957,7 @@ void rtl_dm_diginit(struct ieee80211_hw *hw, u32 cur_igvalue) dm_digtable->bt30_cur_igi = 0x32; dm_digtable->pre_cck_pd_state = CCK_PD_STAGE_MAX; dm_digtable->cur_cck_pd_state = CCK_PD_STAGE_LOWRSSI; + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } EXPORT_SYMBOL(rtl_dm_diginit); diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 42a6fba..902b944 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -151,8 +151,14 @@ static u8 rtl8723e_dm_initial_gain_min_pwdb(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); struct dig_t *dm_digtable = >dm_digtable; + struct rtl_mac *mac = rtl_mac(rtlpriv); long rssi_val_min = 0; + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION && + rtlpriv->link_info.bcn_rx_inperiod == 0) + return 0; + if ((dm_digtable->curmultista_cstate == DIG_MULTISTA_CONNECT) && (dm_digtable->cursta_cstate == DIG_STA_CONNECT)) { if (rtlpriv->dm.entry_min_undec_sm_pwdb != 0) @@ -417,6 +423,8 @@ static void rtl8723e_dm_cck_packet_detection_thresh(struct ieee80211_hw *hw) } else { rtl_set_bbreg(hw, RCCK0_CCA, MASKBYTE2, 0xcd); rtl_set_bbreg(hw, RCCK0_SYSTEM, MASKBYTE1, 0x47); + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } dm_digtable->pre_cck_pd_state = dm_digtable->cur_cck_pd_state; -- 1.9.1
[PATCH v3 1/4] rtl8723ae: Take the FW LPS mode handling out
This appears to trigger a firmware bug and causes severe problems with rtl8723ae PCI devices. When the power save mode is activated for longer periods of time the firmware stops to receive any packets. This problem was exposed by commit 873ffe154ae0 ("rtlwifi: Fix logic error in enter/exit power-save mode"). Previously the power save mode was only active rarely and only for a short time so that the problem was not noticeable. Signed-off-by: Bernd Edlinger --- v2: Adjust the defaults of swlps and fwlps module parameters to match the firmware capabilities instead of removing the whole code, so it can be easily re-activated once a firmware update is available. v3: Make the title fit in one line. --- drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c index 07b82700d1..3103151 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c @@ -266,8 +266,8 @@ static bool is_fw_header(struct rtlwifi_firmware_header *hdr) static struct rtl_mod_params rtl8723e_mod_params = { .sw_crypto = false, .inactiveps = true, - .swctrl_lps = false, - .fwctrl_lps = true, + .swctrl_lps = true, + .fwctrl_lps = false, .aspm_support = 1, .debug_level = 0, .debug_mask = 0, @@ -395,8 +395,8 @@ static bool is_fw_header(struct rtlwifi_firmware_header *hdr) bool, 0444); MODULE_PARM_DESC(swenc, "Set to 1 for software crypto (default 0)\n"); MODULE_PARM_DESC(ips, "Set to 0 to not use link power save (default 1)\n"); -MODULE_PARM_DESC(swlps, "Set to 1 to use SW control power save (default 0)\n"); -MODULE_PARM_DESC(fwlps, "Set to 1 to use FW control power save (default 1)\n"); +MODULE_PARM_DESC(swlps, "Set to 1 to use SW control power save (default 1)\n"); +MODULE_PARM_DESC(fwlps, "Set to 1 to use FW control power save (default 0)\n"); MODULE_PARM_DESC(msi, "Set to 1 to use MSI interrupts mode (default 0)\n"); MODULE_PARM_DESC(aspm, "Set to 1 to enable ASPM (default 1)\n"); MODULE_PARM_DESC(debug_level, "Set debug level (0-5) (default 0)"); -- 1.9.1
[PATCH v3 0/4] rtlwifi: Fix issues with rtl8723ae
Currently the rtl8723ae driver is broken (since v4.7). Connection to AP is lost very often, especially when the signal level is not very good. The main issue is the power save mode is basically not working, and seems to trigger a firmware bug. So I had to take out the FW LPS mode handling. While debugging the driver I found a couple related issues, for instance that the signal level in dm.undec_sm_pwdb is no longer accurate (may be even much too high) when no more packets are received, and it increases the likelihood to receive something if the input gain is set to maximum. The patch was tested with the rtl8723ae PCI card in my laptop against a FRITZ!Box 7590 AP -- the WiFi connection works now very reliable for me. ChangeLog: v2: Adjusts the defaults of swlps and fwlps module parameters to match the firmware capabilities instead of removing the whole code, so it can be easily re-activated once a firmware update is available. v3: Make the title of each patch fit in one line. Bernd Edlinger (4): rtl8723ae: Take the FW LPS mode handling out rtl8723ae: Dont use old data for input gain control rtl8723ae: Re-introduce the adaptive rate control rtlwifi: Don't clear num_rx_inperiod too early drivers/net/wireless/realtek/rtlwifi/base.c| 4 +- drivers/net/wireless/realtek/rtlwifi/core.c| 2 + .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 95 +- .../net/wireless/realtek/rtlwifi/rtl8723ae/sw.c| 8 +- 4 files changed, 101 insertions(+), 8 deletions(-) -- 1.9.1
[PATCH v2 3/4] rtlwifi: rtl8723ae: Re-introduce
rtl8723e_dm_refresh_rate_adaptive_mask This function was present in a previous version of the code base, it works just fine for me -- as long as it is not using stale data. Fixed a style nit in rtl8723e_dm_init_rate_adaptive_mask. Signed-off-by: Bernd Edlinger --- .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 87 +- 1 file changed, 85 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 902b944..acfd54c 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -673,7 +673,7 @@ void rtl8723e_dm_check_txpower_tracking(struct ieee80211_hw *hw) void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); - struct rate_adaptive *p_ra = &(rtlpriv->ra); + struct rate_adaptive *p_ra = >ra; p_ra->ratr_state = DM_RATR_STA_INIT; p_ra->pre_ratr_state = DM_RATR_STA_INIT; @@ -685,6 +685,89 @@ void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) } +void rtl8723e_dm_refresh_rate_adaptive_mask(struct ieee80211_hw *hw) +{ + struct rtl_priv *rtlpriv = rtl_priv(hw); + struct rtl_hal *rtlhal = rtl_hal(rtl_priv(hw)); + struct rtl_mac *mac = rtl_mac(rtl_priv(hw)); + struct rate_adaptive *p_ra = >ra; + u32 low_rssithresh_for_ra, high_rssithresh_for_ra; + struct ieee80211_sta *sta = NULL; + + if (is_hal_stop(rtlhal)) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver is going to unload\n"); + return; + } + + if (!rtlpriv->dm.useramask) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver does not control rate adaptive mask\n"); + return; + } + + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION) { + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_MIDDLE: + high_rssithresh_for_ra = 55; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_LOW: + high_rssithresh_for_ra = 60; + low_rssithresh_for_ra = 25; + break; + default: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + } + + if (rtlpriv->link_info.bcn_rx_inperiod == 0) + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + default: + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + break; + case DM_RATR_STA_MIDDLE: + case DM_RATR_STA_LOW: + p_ra->ratr_state = DM_RATR_STA_LOW; + break; + } + else if (rtlpriv->dm.undec_sm_pwdb > high_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_HIGH; + else if (rtlpriv->dm.undec_sm_pwdb > low_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + else + p_ra->ratr_state = DM_RATR_STA_LOW; + + if (p_ra->pre_ratr_state != p_ra->ratr_state) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI = %ld\n", +rtlpriv->dm.undec_sm_pwdb); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI_LEVEL = %d\n", p_ra->ratr_state); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"PreState = %d, CurState = %d\n", +p_ra->pre_ratr_state, p_ra->ratr_state); + + rcu_read_lock(); + sta = rtl_find_sta(hw, mac->bssid); + if (sta) + rtlpriv->cfg->ops->update_rate_tbl(hw, sta, + p_ra->ratr_state, + true); + rcu_read_unlock(); + + p_ra->pre_ratr_state = p_ra->ratr_state; + } + } +} + void rtl8723e_dm_rf_saving(struct
[PATCH v2 4/4] rtlwifi: Move the clearing of
rtlpriv->link_info.num_rx_inperiod in rtl_watchdog_wq_callback a few lines down This is necessary since it is still used in the "AP off" detection code block. Moved clearing of rtlpriv->link_info.num_rx_inperiod as well for consistency. Signed-off-by: Bernd Edlinger --- drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c b/drivers/net/wireless/realtek/rtlwifi/base.c index ef9b502..7aa68fe 100644 --- a/drivers/net/wireless/realtek/rtlwifi/base.c +++ b/drivers/net/wireless/realtek/rtlwifi/base.c @@ -2172,8 +2172,6 @@ void rtl_watchdog_wq_callback(void *data) ; } - rtlpriv->link_info.num_rx_inperiod = 0; - rtlpriv->link_info.num_tx_inperiod = 0; for (tid = 0; tid <= 7; tid++) rtlpriv->link_info.tidtx_inperiod[tid] = 0; @@ -2236,6 +2234,8 @@ void rtl_watchdog_wq_callback(void *data) rtlpriv->btcoexist.btc_info.in_4way = false; } + rtlpriv->link_info.num_rx_inperiod = 0; + rtlpriv->link_info.num_tx_inperiod = 0; rtlpriv->link_info.bcn_rx_inperiod = 0; /* <6> scan list */ -- 1.9.1
[PATCH v2 2/4] rtlwifi: rtl8723ae: Don't use dm.undec_sm_pwdb for input
gain control when no beacon was received in the connected state When no beacon was received, the value in dm.undec_sm_pwdb is most likely out of date and should not be used to adjust the input path. Assume instead that the signal level is low. Fix the state machine in rtl8723e_dm_cck_packet_detection_thresh which did not clear pre_cck_fa_state when changing cur_cck_pd_state from CCK_PD_STAGE_LOWRSSI/CCK_FA_STAGE_LOW to CCK_PD_STAGE_HIGHRSSI and back again to CCK_PD_STAGE_LOWRSSI/CCK_FA_STAGE_LOW, the register RCCK0_CCA not written to 0x83 on the second change. Explicitly initialize pre_cck_fa_state/cur_cck_fa_state in rtl_dm_diginit. Signed-off-by: Bernd Edlinger --- drivers/net/wireless/realtek/rtlwifi/core.c | 2 ++ drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c | 8 2 files changed, 10 insertions(+) diff --git a/drivers/net/wireless/realtek/rtlwifi/core.c b/drivers/net/wireless/realtek/rtlwifi/core.c index 4bf7967..ce23339 100644 --- a/drivers/net/wireless/realtek/rtlwifi/core.c +++ b/drivers/net/wireless/realtek/rtlwifi/core.c @@ -1957,5 +1957,7 @@ void rtl_dm_diginit(struct ieee80211_hw *hw, u32 cur_igvalue) dm_digtable->bt30_cur_igi = 0x32; dm_digtable->pre_cck_pd_state = CCK_PD_STAGE_MAX; dm_digtable->cur_cck_pd_state = CCK_PD_STAGE_LOWRSSI; + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } EXPORT_SYMBOL(rtl_dm_diginit); diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 42a6fba..902b944 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -151,8 +151,14 @@ static u8 rtl8723e_dm_initial_gain_min_pwdb(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); struct dig_t *dm_digtable = >dm_digtable; + struct rtl_mac *mac = rtl_mac(rtlpriv); long rssi_val_min = 0; + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION && + rtlpriv->link_info.bcn_rx_inperiod == 0) + return 0; + if ((dm_digtable->curmultista_cstate == DIG_MULTISTA_CONNECT) && (dm_digtable->cursta_cstate == DIG_STA_CONNECT)) { if (rtlpriv->dm.entry_min_undec_sm_pwdb != 0) @@ -417,6 +423,8 @@ static void rtl8723e_dm_cck_packet_detection_thresh(struct ieee80211_hw *hw) } else { rtl_set_bbreg(hw, RCCK0_CCA, MASKBYTE2, 0xcd); rtl_set_bbreg(hw, RCCK0_SYSTEM, MASKBYTE1, 0x47); + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } dm_digtable->pre_cck_pd_state = dm_digtable->cur_cck_pd_state; -- 1.9.1
[PATCH v2 1/4] rtlwifi: rtl8723ae: Take the FW LPS mode handling out
This appears to trigger a firmware bug and causes severe problems with rtl8723ae PCI devices. When the power save mode is activated for longer periods of time the firmware stops to receive any packets. This problem was exposed by commit 873ffe154ae0 ("rtlwifi: Fix logic error in enter/exit power-save mode"). Previously the power save mode was only active rarely and only for a short time so that the problem was not noticeable. Signed-off-by: Bernd Edlinger --- drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c index 07b82700d1..3103151 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c @@ -266,8 +266,8 @@ static bool is_fw_header(struct rtlwifi_firmware_header *hdr) static struct rtl_mod_params rtl8723e_mod_params = { .sw_crypto = false, .inactiveps = true, - .swctrl_lps = false, - .fwctrl_lps = true, + .swctrl_lps = true, + .fwctrl_lps = false, .aspm_support = 1, .debug_level = 0, .debug_mask = 0, @@ -395,8 +395,8 @@ static bool is_fw_header(struct rtlwifi_firmware_header *hdr) bool, 0444); MODULE_PARM_DESC(swenc, "Set to 1 for software crypto (default 0)\n"); MODULE_PARM_DESC(ips, "Set to 0 to not use link power save (default 1)\n"); -MODULE_PARM_DESC(swlps, "Set to 1 to use SW control power save (default 0)\n"); -MODULE_PARM_DESC(fwlps, "Set to 1 to use FW control power save (default 1)\n"); +MODULE_PARM_DESC(swlps, "Set to 1 to use SW control power save (default 1)\n"); +MODULE_PARM_DESC(fwlps, "Set to 1 to use FW control power save (default 0)\n"); MODULE_PARM_DESC(msi, "Set to 1 to use MSI interrupts mode (default 0)\n"); MODULE_PARM_DESC(aspm, "Set to 1 to enable ASPM (default 1)\n"); MODULE_PARM_DESC(debug_level, "Set debug level (0-5) (default 0)"); -- 1.9.1
[PATCH v2 0/4] rtlwifi: Fix issues with rtl8723ae
Currently the rtl8723ae driver is broken (since v4.7). Connection to AP is lost very often, especially when the signal level is not very good. The main issue is the power save mode is basically not working, and seems to trigger a firmware bug. So I had to take out the FW LPS mode handling. While debugging the driver I found a couple related issues, for instance that the signal level in dm.undec_sm_pwdb is no longer accurate (may be even much too high) when no more packets are received, and it increases the likelihood to receive something if the input gain is set to maximum. The patch was tested with the rtl8723ae PCI card in my laptop against a FRITZ!Box 7590 AP -- the WiFi connection works now very reliable for me. V2 of the patch adjusts the defaults of swlps and fwlps module parameters to match the firmware capabilities instead of removing the whole code, so it can be easily re-activated once a firmware update is available. Bernd Edlinger (4): rtlwifi: rtl8723ae: Take the FW LPS mode handling out rtlwifi: rtl8723ae: Don't use dm.undec_sm_pwdb for input gain control when no beacon was received in the connected state rtlwifi: rtl8723ae: Re-introduce rtl8723e_dm_refresh_rate_adaptive_mask rtlwifi: Move the clearing of rtlpriv->link_info.num_rx_inperiod in rtl_watchdog_wq_callback a few lines down drivers/net/wireless/realtek/rtlwifi/base.c| 4 +- drivers/net/wireless/realtek/rtlwifi/core.c| 2 + .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 95 +- .../net/wireless/realtek/rtlwifi/rtl8723ae/sw.c| 8 +- 4 files changed, 101 insertions(+), 8 deletions(-) -- 1.9.1
Re: [PATCH 1/4] rtlwifi: rtl8723ae: Take the FW LPS mode handling out
On 1/5/19 5:13 PM, Larry Finger wrote: >> but this works: >> >> modprobe rtl8723ae debug_mask=0x debug_level=5 swlps=1 fwlps=0 > > Yes, I think that is a better thing to do now. If and when Realtek finds a > firmware bug, and when the new firmware is readily available, then there will > not be a lot of code to reinstall. > Okay, my assumption of how the firmware bug "works" is this: Once the firmware enters power save mode, it will deliver exactly one (or maybe two) Rx interrupts. If one of those triggers the driver to leave the power save mode again, the firmware continues to work. If those are only beacons, they won't leave the power save mode. Then the firmware will usually not recover. Since prior to commit 873ffe154ae0 ("rtlwifi: Fix logic error in enter/exit power-save mode") the power save mode was only activated when there _is_ busy traffic, the next packet did usually wake the firmware, rarely it did freeze however. Other things like changing the cck_packet_detection_threshold or refresh_rate_adaptive_mask can also kick the firmware back to life. Hope this helps to track down the root cause of this bug. Thanks Bernd.
Re: [PATCH 1/4] rtlwifi: rtl8723ae: Take the FW LPS mode handling out
On 1/5/19 3:44 AM, Larry Finger wrote: > On 1/4/19 6:48 AM, Bernd Edlinger wrote: >> This appears to trigger a firmware bug and causes severe >> problems with rtl8723ae PCI devices. >> >> When the power save mode is activated for longer periods >> of time the firmware stops to receive any packets. >> >> This problem was exposed by commit 873ffe154ae0 ("rtlwifi: >> Fix logic error in enter/exit power-save mode"). >> >> Previously the power save mode was only active rarely and >> only for a short time so that the problem was not noticeable. >> >> Signed-off-by: Bernd Edlinger >> --- > > While the Realtek firmware group has a chance to look for a bug, I would like > you to perform a couple of tests on the original code. > > The driver has three module parameters that affect power save. The 'modinfo > rtl8723ae' command lists them as > > parm: ips:Set to 0 to not use link power save (default 1) (bool) > parm: swlps:Set to 1 to use SW control power save (default 0) (bool) > parm: fwlps:Set to 1 to use FW control power save (default 1) (bool) > > If you were to load rtl8723ae with 'ips=0', does it still fail? > If you were to load the driver with 'swlps=1 fwlps=0', does it still fail? > this does not work: modprobe rtl8723ae debug_mask=0x debug_level=5 ips=0 tail -f /var/log/syslog|grep "AP off" Jan 5 11:42:06 w-ed kernel: [ 7267.229713] rtlwifi: :<0> AP off for 2 s Jan 5 11:42:08 w-ed kernel: [ 7269.276761] rtlwifi: :<0> AP off for 4 s Jan 5 11:42:10 w-ed kernel: [ 7271.323758] rtlwifi: :<0> AP off for 6 s Jan 5 11:42:12 w-ed kernel: [ 7273.370759] rtlwifi: :<0> AP off for 8 s Jan 5 11:42:14 w-ed kernel: [ 7275.417753] rtlwifi: :<0> AP off for 10 s Jan 5 11:42:14 w-ed kernel: [ 7275.417754] rtlwifi: AP off, try to reconnect now Jan 5 11:42:28 w-ed kernel: [ 7289.746676] rtlwifi: :<0> AP off for 2 s Jan 5 11:42:40 w-ed kernel: [ 7302.028327] rtlwifi: :<0> AP off for 2 s Jan 5 11:42:43 w-ed kernel: [ 7304.075327] rtlwifi: :<0> AP off for 4 s Jan 5 11:42:45 w-ed kernel: [ 7306.122330] rtlwifi: :<0> AP off for 6 s Jan 5 11:42:47 w-ed kernel: [ 7308.169292] rtlwifi: :<0> AP off for 8 s Jan 5 11:42:49 w-ed kernel: [ 7310.216236] rtlwifi: :<0> AP off for 10 s Jan 5 11:42:49 w-ed kernel: [ 7310.216238] rtlwifi: AP off, try to reconnect now Jan 5 11:43:05 w-ed kernel: [ 7326.59] rtlwifi: :<0> AP off for 2 s Jan 5 11:43:07 w-ed kernel: [ 7328.639076] rtlwifi: :<0> AP off for 4 s Jan 5 11:43:09 w-ed kernel: [ 7330.686220] rtlwifi: :<0> AP off for 6 s Jan 5 11:43:11 w-ed kernel: [ 7332.733078] rtlwifi: :<0> AP off for 8 s Jan 5 11:43:13 w-ed kernel: [ 7334.779988] rtlwifi: :<0> AP off for 10 s Jan 5 11:43:13 w-ed kernel: [ 7334.779989] rtlwifi: AP off, try to reconnect now Jan 5 11:43:28 w-ed kernel: [ 7349.108839] rtlwifi: :<0> AP off for 2 s Jan 5 11:43:30 w-ed kernel: [ 7351.155837] rtlwifi: :<0> AP off for 4 s Jan 5 11:43:32 w-ed kernel: [ 7353.202838] rtlwifi: :<0> AP off for 6 s Jan 5 11:43:42 w-ed kernel: [ 7363.437779] rtlwifi: :<0> AP off for 2 s Jan 5 11:43:46 w-ed kernel: [ 7367.531622] rtlwifi: :<0> AP off for 2 s Jan 5 11:43:48 w-ed kernel: [ 7369.578597] rtlwifi: :<0> AP off for 4 s Jan 5 11:43:50 w-ed kernel: [ 7371.625694] rtlwifi: :<0> AP off for 6 s Jan 5 11:43:52 w-ed kernel: [ 7373.672691] rtlwifi: :<0> AP off for 8 s Jan 5 11:43:54 w-ed kernel: [ 7375.719690] rtlwifi: :<0> AP off for 10 s Jan 5 11:43:54 w-ed kernel: [ 7375.719691] rtlwifi: AP off, try to reconnect now Jan 5 11:44:09 w-ed kernel: [ 7390.048406] rtlwifi: :<0> AP off for 2 s Jan 5 11:44:11 w-ed kernel: [ 7392.095678] rtlwifi: :<0> AP off for 4 s Jan 5 11:44:13 w-ed kernel: [ 7394.142349] rtlwifi: :<0> AP off for 6 s Jan 5 11:44:15 w-ed kernel: [ 7396.189352] rtlwifi: :<0> AP off for 8 s Jan 5 11:44:17 w-ed kernel: [ 7398.236352] rtlwifi: :<0> AP off for 10 s Jan 5 11:44:17 w-ed kernel: [ 7398.236353] rtlwifi: AP off, try to reconnect now Jan 5 11:44:31 w-ed kernel: [ 7412.565079] rtlwifi: :<0> AP off for 2 s Jan 5 11:44:33 w-ed kernel: [ 7414.612167] rtlwifi: :<0> AP off for 4 s Jan 5 11:44:35 w-ed kernel: [ 7416.659101] rtlwifi: :<0> AP off for 6 s Jan 5 11:44:37 w-ed kernel: [ 7418.706035] rtlwifi: :<0> AP off for 8 s Jan 5 11:44:39 w-ed kernel: [ 7420.753100] rtlwifi: :<0> AP off for 10 s Jan 5 11:44:39 w-ed kernel: [ 7420.753101] rtlwifi: AP off, try to reconnect now Jan 5 11:44:54 w-ed kernel: [ 7435.081860] rtlwifi: :<0> AP off for 2 s Jan 5 11:44:56 w-ed kernel: [ 7437.128857] rtlwifi: :<0> AP off for 4 s Jan 5 11:45:08 w-ed kernel: [ 7449.410653] rtlwifi: :<0> AP off for 2
[PATCH 1/4] rtlwifi: rtl8723ae: Take the FW LPS mode handling out
This appears to trigger a firmware bug and causes severe problems with rtl8723ae PCI devices. When the power save mode is activated for longer periods of time the firmware stops to receive any packets. This problem was exposed by commit 873ffe154ae0 ("rtlwifi: Fix logic error in enter/exit power-save mode"). Previously the power save mode was only active rarely and only for a short time so that the problem was not noticeable. Signed-off-by: Bernd Edlinger --- .../net/wireless/realtek/rtlwifi/rtl8723ae/fw.c| 20 .../net/wireless/realtek/rtlwifi/rtl8723ae/fw.h| 1 - .../net/wireless/realtek/rtlwifi/rtl8723ae/hw.c| 56 +- 3 files changed, 1 insertion(+), 76 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.c index bf9859f..77833fb 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.c @@ -268,26 +268,6 @@ void rtl8723e_fill_h2c_cmd(struct ieee80211_hw *hw, (u8 *)_cmdbuf); } -void rtl8723e_set_fw_pwrmode_cmd(struct ieee80211_hw *hw, u8 mode) -{ - struct rtl_priv *rtlpriv = rtl_priv(hw); - u8 u1_h2c_set_pwrmode[3] = { 0 }; - struct rtl_ps_ctl *ppsc = rtl_psc(rtl_priv(hw)); - - RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD, "FW LPS mode = %d\n", mode); - - SET_H2CCMD_PWRMODE_PARM_MODE(u1_h2c_set_pwrmode, mode); - SET_H2CCMD_PWRMODE_PARM_SMART_PS(u1_h2c_set_pwrmode, - (rtlpriv->mac80211.p2p) ? ppsc->smart_ps : 1); - SET_H2CCMD_PWRMODE_PARM_BCN_PASS_TIME(u1_h2c_set_pwrmode, - ppsc->reg_max_lps_awakeintvl); - - RT_PRINT_DATA(rtlpriv, COMP_CMD, DBG_DMESG, - "rtl8723e_set_fw_rsvdpagepkt(): u1_h2c_set_pwrmode\n", - u1_h2c_set_pwrmode, 3); - rtl8723e_fill_h2c_cmd(hw, H2C_SETPWRMODE, 3, u1_h2c_set_pwrmode); -} - #define BEACON_PG 0 /* ->1 */ #define PSPOLL_PG 2 #define NULL_PG 3 diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.h index 2e668fc..8618b82 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.h +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/fw.h @@ -55,7 +55,6 @@ void rtl8723e_fill_h2c_cmd(struct ieee80211_hw *hw, u8 element_id, u32 cmd_len, u8 *p_cmdbuffer); -void rtl8723e_set_fw_pwrmode_cmd(struct ieee80211_hw *hw, u8 mode); void rtl8723e_set_fw_rsvdpagepkt(struct ieee80211_hw *hw, bool b_dl_finished); void rtl8723e_set_fw_joinbss_report_cmd(struct ieee80211_hw *hw, u8 mstatus); void rtl8723e_set_p2p_ps_offload_cmd(struct ieee80211_hw *hw, u8 p2p_ps_state); diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/hw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/hw.c index f783e4a..f0eb356 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/hw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/hw.c @@ -408,29 +408,7 @@ void rtl8723e_set_hw_reg(struct ieee80211_hw *hw, u8 variable, u8 *val) case HW_VAR_WPA_CONFIG: rtl_write_byte(rtlpriv, REG_SECCFG, *((u8 *)val)); break; - case HW_VAR_SET_RPWM:{ - u8 rpwm_val; - - rpwm_val = rtl_read_byte(rtlpriv, REG_PCIE_HRPWM); - udelay(1); - - if (rpwm_val & BIT(7)) { - rtl_write_byte(rtlpriv, REG_PCIE_HRPWM, - (*(u8 *)val)); - } else { - rtl_write_byte(rtlpriv, REG_PCIE_HRPWM, - ((*(u8 *)val) | BIT(7))); - } - - break; - } case HW_VAR_H2C_FW_PWRMODE:{ - u8 psmode = (*(u8 *)val); - - if (psmode != FW_PS_ACTIVE_MODE) - rtl8723e_dm_rf_saving(hw, true); - - rtl8723e_set_fw_pwrmode_cmd(hw, (*(u8 *)val)); break; } case HW_VAR_FW_PSMODE_STATUS: @@ -513,39 +491,7 @@ void rtl8723e_set_hw_reg(struct ieee80211_hw *hw, u8 variable, u8 *val) break; } case HW_VAR_FW_LPS_ACTION:{ - bool b_enter_fwlps = *((bool *)val); - u8 rpwm_val, fw_pwrmode; - bool fw_current_inps; - - if (b_enter_fwlps) { - rpwm_val = 0x02;/* RF off */ - fw_current_inps = true; - rtlpriv->cfg->ops->set_hw_reg(hw, -
[PATCH 2/4] rtlwifi: rtl8723ae: Don't use dm.undec_sm_pwdb for input
gain control when no beacon was received in the connected state. This avoids sporadic "Connection to AP ... lost" errors. Signed-off-by: Bernd Edlinger --- drivers/net/wireless/realtek/rtlwifi/core.c | 2 ++ drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c | 8 2 files changed, 10 insertions(+) diff --git a/drivers/net/wireless/realtek/rtlwifi/core.c b/drivers/net/wireless/realtek/rtlwifi/core.c index 4bf7967..ce23339 100644 --- a/drivers/net/wireless/realtek/rtlwifi/core.c +++ b/drivers/net/wireless/realtek/rtlwifi/core.c @@ -1957,5 +1957,7 @@ void rtl_dm_diginit(struct ieee80211_hw *hw, u32 cur_igvalue) dm_digtable->bt30_cur_igi = 0x32; dm_digtable->pre_cck_pd_state = CCK_PD_STAGE_MAX; dm_digtable->cur_cck_pd_state = CCK_PD_STAGE_LOWRSSI; + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } EXPORT_SYMBOL(rtl_dm_diginit); diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 42a6fba..902b944 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -151,8 +151,14 @@ static u8 rtl8723e_dm_initial_gain_min_pwdb(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); struct dig_t *dm_digtable = >dm_digtable; + struct rtl_mac *mac = rtl_mac(rtlpriv); long rssi_val_min = 0; + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION && + rtlpriv->link_info.bcn_rx_inperiod == 0) + return 0; + if ((dm_digtable->curmultista_cstate == DIG_MULTISTA_CONNECT) && (dm_digtable->cursta_cstate == DIG_STA_CONNECT)) { if (rtlpriv->dm.entry_min_undec_sm_pwdb != 0) @@ -417,6 +423,8 @@ static void rtl8723e_dm_cck_packet_detection_thresh(struct ieee80211_hw *hw) } else { rtl_set_bbreg(hw, RCCK0_CCA, MASKBYTE2, 0xcd); rtl_set_bbreg(hw, RCCK0_SYSTEM, MASKBYTE1, 0x47); + dm_digtable->pre_cck_fa_state = 0; + dm_digtable->cur_cck_fa_state = 0; } dm_digtable->pre_cck_pd_state = dm_digtable->cur_cck_pd_state; -- 1.9.1
[PATCH 4/4] rtlwifi: Move the clearing of
rtlpriv->link_info.num_rx_inperiod in rtl_watchdog_wq_callback a few lines down. This is necessary since it is still used in the "AP off" detection code block. Moved clearing of rtlpriv->link_info.num_rx_inperiod as well for consistency. Signed-off-by: Bernd Edlinger --- drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c b/drivers/net/wireless/realtek/rtlwifi/base.c index ef9b502..7aa68fe 100644 --- a/drivers/net/wireless/realtek/rtlwifi/base.c +++ b/drivers/net/wireless/realtek/rtlwifi/base.c @@ -2172,8 +2172,6 @@ void rtl_watchdog_wq_callback(void *data) ; } - rtlpriv->link_info.num_rx_inperiod = 0; - rtlpriv->link_info.num_tx_inperiod = 0; for (tid = 0; tid <= 7; tid++) rtlpriv->link_info.tidtx_inperiod[tid] = 0; @@ -2236,6 +2234,8 @@ void rtl_watchdog_wq_callback(void *data) rtlpriv->btcoexist.btc_info.in_4way = false; } + rtlpriv->link_info.num_rx_inperiod = 0; + rtlpriv->link_info.num_tx_inperiod = 0; rtlpriv->link_info.bcn_rx_inperiod = 0; /* <6> scan list */ -- 1.9.1
[PATCH 0/4] rtlwifi: Fix issues with rtl8723ae
Currently the rtl8723ae driver is broken (since v4.7). Connection to AP is lost very often, especially when the signal level is not very good. The main issue is the power save mode is basically not working, and seems to trigger a firmware bug. So I had to remove that code. While debugging the driver I found a couple related issues, for instance that the signal level in dm.undec_sm_pwdb is no longer accurate (may be even much too high) when no more packets are received, and it increases the likelihood to receive something if the input gain is set to maximum. The patch was tested with the rtl8723ae PCI card in my laptop against a FRITZ!Box 7590 AP -- the WiFi connection works now very reliable for me. Bernd Edlinger (4): rtlwifi: rtl8723ae: Take the FW LPS mode handling out rtlwifi: rtl8723ae: Don't use dm.undec_sm_pwdb for input gain control when no beacon was received in the connected state. rtlwifi: rtl8723ae: Re-introduce rtl8723e_dm_refresh_rate_adaptive_mask rtlwifi: Move the clearing of rtlpriv->link_info.num_rx_inperiod in rtl_watchdog_wq_callback a few lines down. drivers/net/wireless/realtek/rtlwifi/base.c| 4 +- drivers/net/wireless/realtek/rtlwifi/core.c| 2 + .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 95 +- .../net/wireless/realtek/rtlwifi/rtl8723ae/fw.c| 20 - .../net/wireless/realtek/rtlwifi/rtl8723ae/fw.h| 1 - .../net/wireless/realtek/rtlwifi/rtl8723ae/hw.c| 56 + 6 files changed, 98 insertions(+), 80 deletions(-) -- 1.9.1
[PATCH 3/4] rtlwifi: rtl8723ae: Re-introduce
rtl8723e_dm_refresh_rate_adaptive_mask This function was present in a previous version of the code base, it works just fine for me -- as long as it is not using stale data. Fixed a style nit in rtl8723e_dm_init_rate_adaptive_mask. Signed-off-by: Bernd Edlinger --- .../net/wireless/realtek/rtlwifi/rtl8723ae/dm.c| 87 +- 1 file changed, 85 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index 902b944..acfd54c 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -673,7 +673,7 @@ void rtl8723e_dm_check_txpower_tracking(struct ieee80211_hw *hw) void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); - struct rate_adaptive *p_ra = &(rtlpriv->ra); + struct rate_adaptive *p_ra = >ra; p_ra->ratr_state = DM_RATR_STA_INIT; p_ra->pre_ratr_state = DM_RATR_STA_INIT; @@ -685,6 +685,89 @@ void rtl8723e_dm_init_rate_adaptive_mask(struct ieee80211_hw *hw) } +void rtl8723e_dm_refresh_rate_adaptive_mask(struct ieee80211_hw *hw) +{ + struct rtl_priv *rtlpriv = rtl_priv(hw); + struct rtl_hal *rtlhal = rtl_hal(rtl_priv(hw)); + struct rtl_mac *mac = rtl_mac(rtl_priv(hw)); + struct rate_adaptive *p_ra = >ra; + u32 low_rssithresh_for_ra, high_rssithresh_for_ra; + struct ieee80211_sta *sta = NULL; + + if (is_hal_stop(rtlhal)) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver is going to unload\n"); + return; + } + + if (!rtlpriv->dm.useramask) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +" driver does not control rate adaptive mask\n"); + return; + } + + if (mac->link_state == MAC80211_LINKED && + mac->opmode == NL80211_IFTYPE_STATION) { + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_MIDDLE: + high_rssithresh_for_ra = 55; + low_rssithresh_for_ra = 20; + break; + case DM_RATR_STA_LOW: + high_rssithresh_for_ra = 60; + low_rssithresh_for_ra = 25; + break; + default: + high_rssithresh_for_ra = 50; + low_rssithresh_for_ra = 20; + break; + } + + if (rtlpriv->link_info.bcn_rx_inperiod == 0) + switch (p_ra->pre_ratr_state) { + case DM_RATR_STA_HIGH: + default: + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + break; + case DM_RATR_STA_MIDDLE: + case DM_RATR_STA_LOW: + p_ra->ratr_state = DM_RATR_STA_LOW; + break; + } + else if (rtlpriv->dm.undec_sm_pwdb > high_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_HIGH; + else if (rtlpriv->dm.undec_sm_pwdb > low_rssithresh_for_ra) + p_ra->ratr_state = DM_RATR_STA_MIDDLE; + else + p_ra->ratr_state = DM_RATR_STA_LOW; + + if (p_ra->pre_ratr_state != p_ra->ratr_state) { + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI = %ld\n", +rtlpriv->dm.undec_sm_pwdb); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"RSSI_LEVEL = %d\n", p_ra->ratr_state); + RT_TRACE(rtlpriv, COMP_RATE, DBG_LOUD, +"PreState = %d, CurState = %d\n", +p_ra->pre_ratr_state, p_ra->ratr_state); + + rcu_read_lock(); + sta = rtl_find_sta(hw, mac->bssid); + if (sta) + rtlpriv->cfg->ops->update_rate_tbl(hw, sta, + p_ra->ratr_state, + true); + rcu_read_unlock(); + + p_ra->pre_ratr_state = p_ra->ratr_state; + } + } +} + void rtl8723e_dm_rf_saving(struct
Re: [PATCHv3] Fix range checks in kernfs_get_target_path
Ping... Sorry, I had actually completely forgotten about this one. On 07/07/18 19:52, Bernd Edlinger wrote: > The terminating NUL byte is only there because the buffer is > allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the > range-check is off-by-one, and PAGE_SIZE==PATH_MAX, the > returned string may not be zero-terminated if it is exactly > PATH_MAX characters long. Furthermore also the initial loop > may theoretically exceed PATH_MAX and cause a fault. > > Signed-off-by: Bernd Edlinger > --- > fs/kernfs/symlink.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c > index 08ccabd..774bada 100644 > --- a/fs/kernfs/symlink.c > +++ b/fs/kernfs/symlink.c > @@ -63,6 +63,9 @@ static int kernfs_get_target_path(struct kernfs_node > if (base == kn) > break; > > + if ((s - path) + 3 >= PATH_MAX) > + return -ENAMETOOLONG; > + > strcpy(s, "../"); > s += 3; > base = base->parent; > @@ -79,7 +82,7 @@ static int kernfs_get_target_path(struct kernfs_node > if (len < 2) > return -EINVAL; > len--; > - if ((s - path) + len > PATH_MAX) > + if ((s - path) + len >= PATH_MAX) > return -ENAMETOOLONG; > > /* reverse fillup of target string from target to base */
Re: [PATCHv3] Fix range checks in kernfs_get_target_path
Ping... Sorry, I had actually completely forgotten about this one. On 07/07/18 19:52, Bernd Edlinger wrote: > The terminating NUL byte is only there because the buffer is > allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the > range-check is off-by-one, and PAGE_SIZE==PATH_MAX, the > returned string may not be zero-terminated if it is exactly > PATH_MAX characters long. Furthermore also the initial loop > may theoretically exceed PATH_MAX and cause a fault. > > Signed-off-by: Bernd Edlinger > --- > fs/kernfs/symlink.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c > index 08ccabd..774bada 100644 > --- a/fs/kernfs/symlink.c > +++ b/fs/kernfs/symlink.c > @@ -63,6 +63,9 @@ static int kernfs_get_target_path(struct kernfs_node > if (base == kn) > break; > > + if ((s - path) + 3 >= PATH_MAX) > + return -ENAMETOOLONG; > + > strcpy(s, "../"); > s += 3; > base = base->parent; > @@ -79,7 +82,7 @@ static int kernfs_get_target_path(struct kernfs_node > if (len < 2) > return -EINVAL; > len--; > - if ((s - path) + len > PATH_MAX) > + if ((s - path) + len >= PATH_MAX) > return -ENAMETOOLONG; > > /* reverse fillup of target string from target to base */
Re: [PATCH] Fix the oldconfig target rule
On 07/08/18 12:41, Masahiro Yamada wrote: > 2018-07-08 17:09 GMT+09:00 Bernd Edlinger : >> On 07/08/18 08:31, Bernd Edlinger wrote: >>> The next make after an oldconfig reads in the outdated >>> include/config/auto.conf which can kill the make before >>> it is able to call the syncconfig target. >>> >>> $ make defconfig >>> *** Default configuration is based on 'x86_64_defconfig' >>> $ make >>> scripts/kconfig/conf --syncconfig Kconfig >>> Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, >>> please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. >>> >>> $ sed -i s/CONFIG_UNWINDER_ORC=y// .configs >>> $ make oldconfig >>> Choose kernel unwinder >>>> 1. ORC unwinder (UNWINDER_ORC) (NEW) >>> 2. Frame pointer unwinder (UNWINDER_FRAME_POINTER) >>> choice[1-2?]: 2 >>> $ make >>> Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, >>> please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. >>> >>> With this patch oldconfig and similar targets like >>> menuconfig call syncconfig to avoid this failure. >>> >>> Signed-off-by: Bernd Edlinger >>> --- >>>scripts/kconfig/Makefile | 1 + >>>1 file changed, 1 insertion(+) >>> >>> diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile >>> index a3ac2c9..ef34275 100644 >>> --- a/scripts/kconfig/Makefile >>> +++ b/scripts/kconfig/Makefile >>> @@ -62,6 +62,7 @@ PHONY += $(simple-targets) >>> >>>$(simple-targets): $(obj)/conf >>>$< $(silent) --$@ $(Kconfig) >>> +$(Q)$(MAKE) -f $(srctree)/Makefile syncconfig >> >> On second thought, removing the auto.conf might be sufficient: >> $(Q)rm -f include/config/auto.conf include/config/auto.conf.cmd >> >> Would you prefer that? > > No. > > My suggestion is this: > https://patchwork.kernel.org/patch/10513065/ > > Ok, thanks a lot. I tried your patch and it seems reasonable me. (nit: In the change log entry s/RETPORINE/RETPOLINE/) However the warning is gone as well, which was probably done on purpose to push people to install the elf library. I wonder in general if the test using HOSTCC misses somehow the possibility that there may be a CROSS_COMPLILE ? Bernd.
Re: [PATCH] Fix the oldconfig target rule
On 07/08/18 12:41, Masahiro Yamada wrote: > 2018-07-08 17:09 GMT+09:00 Bernd Edlinger : >> On 07/08/18 08:31, Bernd Edlinger wrote: >>> The next make after an oldconfig reads in the outdated >>> include/config/auto.conf which can kill the make before >>> it is able to call the syncconfig target. >>> >>> $ make defconfig >>> *** Default configuration is based on 'x86_64_defconfig' >>> $ make >>> scripts/kconfig/conf --syncconfig Kconfig >>> Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, >>> please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. >>> >>> $ sed -i s/CONFIG_UNWINDER_ORC=y// .configs >>> $ make oldconfig >>> Choose kernel unwinder >>>> 1. ORC unwinder (UNWINDER_ORC) (NEW) >>> 2. Frame pointer unwinder (UNWINDER_FRAME_POINTER) >>> choice[1-2?]: 2 >>> $ make >>> Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, >>> please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. >>> >>> With this patch oldconfig and similar targets like >>> menuconfig call syncconfig to avoid this failure. >>> >>> Signed-off-by: Bernd Edlinger >>> --- >>>scripts/kconfig/Makefile | 1 + >>>1 file changed, 1 insertion(+) >>> >>> diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile >>> index a3ac2c9..ef34275 100644 >>> --- a/scripts/kconfig/Makefile >>> +++ b/scripts/kconfig/Makefile >>> @@ -62,6 +62,7 @@ PHONY += $(simple-targets) >>> >>>$(simple-targets): $(obj)/conf >>>$< $(silent) --$@ $(Kconfig) >>> +$(Q)$(MAKE) -f $(srctree)/Makefile syncconfig >> >> On second thought, removing the auto.conf might be sufficient: >> $(Q)rm -f include/config/auto.conf include/config/auto.conf.cmd >> >> Would you prefer that? > > No. > > My suggestion is this: > https://patchwork.kernel.org/patch/10513065/ > > Ok, thanks a lot. I tried your patch and it seems reasonable me. (nit: In the change log entry s/RETPORINE/RETPOLINE/) However the warning is gone as well, which was probably done on purpose to push people to install the elf library. I wonder in general if the test using HOSTCC misses somehow the possibility that there may be a CROSS_COMPLILE ? Bernd.
Re: [PATCH] Fix the oldconfig target rule
On 07/08/18 08:31, Bernd Edlinger wrote: > The next make after an oldconfig reads in the outdated > include/config/auto.conf which can kill the make before > it is able to call the syncconfig target. > > $ make defconfig > *** Default configuration is based on 'x86_64_defconfig' > $ make > scripts/kconfig/conf --syncconfig Kconfig > Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, > please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. > > $ sed -i s/CONFIG_UNWINDER_ORC=y// .configs > $ make oldconfig > Choose kernel unwinder >> 1. ORC unwinder (UNWINDER_ORC) (NEW) > 2. Frame pointer unwinder (UNWINDER_FRAME_POINTER) > choice[1-2?]: 2 > $ make > Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, > please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. > > With this patch oldconfig and similar targets like > menuconfig call syncconfig to avoid this failure. > > Signed-off-by: Bernd Edlinger > --- > scripts/kconfig/Makefile | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile > index a3ac2c9..ef34275 100644 > --- a/scripts/kconfig/Makefile > +++ b/scripts/kconfig/Makefile > @@ -62,6 +62,7 @@ PHONY += $(simple-targets) > > $(simple-targets): $(obj)/conf > $< $(silent) --$@ $(Kconfig) > + $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig On second thought, removing the auto.conf might be sufficient: $(Q)rm -f include/config/auto.conf include/config/auto.conf.cmd Would you prefer that? > > PHONY += oldnoconfig silentoldconfig savedefconfig defconfig >
Re: [PATCH] Fix the oldconfig target rule
On 07/08/18 08:31, Bernd Edlinger wrote: > The next make after an oldconfig reads in the outdated > include/config/auto.conf which can kill the make before > it is able to call the syncconfig target. > > $ make defconfig > *** Default configuration is based on 'x86_64_defconfig' > $ make > scripts/kconfig/conf --syncconfig Kconfig > Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, > please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. > > $ sed -i s/CONFIG_UNWINDER_ORC=y// .configs > $ make oldconfig > Choose kernel unwinder >> 1. ORC unwinder (UNWINDER_ORC) (NEW) > 2. Frame pointer unwinder (UNWINDER_FRAME_POINTER) > choice[1-2?]: 2 > $ make > Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, > please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. > > With this patch oldconfig and similar targets like > menuconfig call syncconfig to avoid this failure. > > Signed-off-by: Bernd Edlinger > --- > scripts/kconfig/Makefile | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile > index a3ac2c9..ef34275 100644 > --- a/scripts/kconfig/Makefile > +++ b/scripts/kconfig/Makefile > @@ -62,6 +62,7 @@ PHONY += $(simple-targets) > > $(simple-targets): $(obj)/conf > $< $(silent) --$@ $(Kconfig) > + $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig On second thought, removing the auto.conf might be sufficient: $(Q)rm -f include/config/auto.conf include/config/auto.conf.cmd Would you prefer that? > > PHONY += oldnoconfig silentoldconfig savedefconfig defconfig >
[PATCH] Fix the oldconfig target rule
The next make after an oldconfig reads in the outdated include/config/auto.conf which can kill the make before it is able to call the syncconfig target. $ make defconfig *** Default configuration is based on 'x86_64_defconfig' $ make scripts/kconfig/conf --syncconfig Kconfig Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. $ sed -i s/CONFIG_UNWINDER_ORC=y// .configs $ make oldconfig Choose kernel unwinder > 1. ORC unwinder (UNWINDER_ORC) (NEW) 2. Frame pointer unwinder (UNWINDER_FRAME_POINTER) choice[1-2?]: 2 $ make Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. With this patch oldconfig and similar targets like menuconfig call syncconfig to avoid this failure. Signed-off-by: Bernd Edlinger --- scripts/kconfig/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index a3ac2c9..ef34275 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -62,6 +62,7 @@ PHONY += $(simple-targets) $(simple-targets): $(obj)/conf $< $(silent) --$@ $(Kconfig) + $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig PHONY += oldnoconfig silentoldconfig savedefconfig defconfig -- 1.9.1
[PATCH] Fix the oldconfig target rule
The next make after an oldconfig reads in the outdated include/config/auto.conf which can kill the make before it is able to call the syncconfig target. $ make defconfig *** Default configuration is based on 'x86_64_defconfig' $ make scripts/kconfig/conf --syncconfig Kconfig Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. $ sed -i s/CONFIG_UNWINDER_ORC=y// .configs $ make oldconfig Choose kernel unwinder > 1. ORC unwinder (UNWINDER_ORC) (NEW) 2. Frame pointer unwinder (UNWINDER_FRAME_POINTER) choice[1-2?]: 2 $ make Makefile:936: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. With this patch oldconfig and similar targets like menuconfig call syncconfig to avoid this failure. Signed-off-by: Bernd Edlinger --- scripts/kconfig/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index a3ac2c9..ef34275 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -62,6 +62,7 @@ PHONY += $(simple-targets) $(simple-targets): $(obj)/conf $< $(silent) --$@ $(Kconfig) + $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig PHONY += oldnoconfig silentoldconfig savedefconfig defconfig -- 1.9.1
[PATCHv3] Fix range checks in kernfs_get_target_path
The terminating NUL byte is only there because the buffer is allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check is off-by-one, and PAGE_SIZE==PATH_MAX, the returned string may not be zero-terminated if it is exactly PATH_MAX characters long. Furthermore also the initial loop may theoretically exceed PATH_MAX and cause a fault. Signed-off-by: Bernd Edlinger --- fs/kernfs/symlink.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c index 08ccabd..774bada 100644 --- a/fs/kernfs/symlink.c +++ b/fs/kernfs/symlink.c @@ -63,6 +63,9 @@ static int kernfs_get_target_path(struct kernfs_node if (base == kn) break; + if ((s - path) + 3 >= PATH_MAX) + return -ENAMETOOLONG; + strcpy(s, "../"); s += 3; base = base->parent; @@ -79,7 +82,7 @@ static int kernfs_get_target_path(struct kernfs_node if (len < 2) return -EINVAL; len--; - if ((s - path) + len > PATH_MAX) + if ((s - path) + len >= PATH_MAX) return -ENAMETOOLONG; /* reverse fillup of target string from target to base */ -- 1.9.1
[PATCHv3] Fix range checks in kernfs_get_target_path
The terminating NUL byte is only there because the buffer is allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check is off-by-one, and PAGE_SIZE==PATH_MAX, the returned string may not be zero-terminated if it is exactly PATH_MAX characters long. Furthermore also the initial loop may theoretically exceed PATH_MAX and cause a fault. Signed-off-by: Bernd Edlinger --- fs/kernfs/symlink.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c index 08ccabd..774bada 100644 --- a/fs/kernfs/symlink.c +++ b/fs/kernfs/symlink.c @@ -63,6 +63,9 @@ static int kernfs_get_target_path(struct kernfs_node if (base == kn) break; + if ((s - path) + 3 >= PATH_MAX) + return -ENAMETOOLONG; + strcpy(s, "../"); s += 3; base = base->parent; @@ -79,7 +82,7 @@ static int kernfs_get_target_path(struct kernfs_node if (len < 2) return -EINVAL; len--; - if ((s - path) + len > PATH_MAX) + if ((s - path) + len >= PATH_MAX) return -ENAMETOOLONG; /* reverse fillup of target string from target to base */ -- 1.9.1
Re: [PATCHv2] Fix range checks in kernfs_get_target_path
On 07/07/18 16:01, Greg Kroah-Hartman wrote: > On Sat, Jul 07, 2018 at 09:41:03AM +0000, Bernd Edlinger wrote: >> The strncpy causes a warning [-Wstringop-truncation] here, >> which indicates that it never appends a NUL byte to the path. >> The NUL byte is only there because the buffer is allocated >> with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check >> is also off-by-one, and PAGE_SIZE==PATH_MAX the returned string >> will not be zero-terminated if it is exactly PATH_MAX characters. >> Furthermore also the initial loop may theoretically exceed PATH_MAX >> and cause a fault. >> >> Signed-off-by: Bernd Edlinger >> --- >>fs/kernfs/symlink.c | 10 +++--- >>1 file changed, 7 insertions(+), 3 deletions(-) >> >> diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c >> index 08ccabd..c8b7d44a 100644 >> --- a/fs/kernfs/symlink.c >> +++ b/fs/kernfs/symlink.c >> @@ -63,7 +63,10 @@ static int kernfs_get_target_path(struct kernfs_node >> if (base == kn) >> break; >> >> -strcpy(s, "../"); >> +if ((s - path) + 3 >= PATH_MAX) >> +return -ENAMETOOLONG; >> + >> +memcpy(s, "../", 3); >> s += 3; >> base = base->parent; >> } >> @@ -79,16 +82,17 @@ static int kernfs_get_target_path(struct kernfs_node >> if (len < 2) >> return -EINVAL; >> len--; >> -if ((s - path) + len > PATH_MAX) >> +if ((s - path) + len >= PATH_MAX) >> return -ENAMETOOLONG; >> >> /* reverse fillup of target string from target to base */ >> kn = target; >> +s[len] = '\0'; >> while (kn->parent && kn != base) { >> int slen = strlen(kn->name); >> >> len -= slen; >> -strncpy(s + len, kn->name, slen); >> +memcpy(s + len, kn->name, slen); >> if (len) >> s[--len] = '/'; >> > > This last memcpy replacement has already been applied to my tree, from a > patch from soeone else, so are you sure all of the other changes are > also really needed? Why the extra \0 termination of a string that is > already terminated? > I did only a code review, but the range checks look really dangerously wrong. The string is only zero-terminated because it is allocated in kernfs_iop_get_link with body = kzalloc(PAGE_SIZE, GFP_KERNEL); I would recommend to explicitly place a termination in the buffer, and not rely on the way how the buffer is allocated. > And why is the first memcpy replacement needed? gcc doesn't say > anything about that, does it? > No, that is more or less for efficiency reasons, since it is writing the NUL which is always overwritten with something different in the next step. If the loop is executed zero times, there result is not explicitly zero-terminated either, so using strcpy is somehow misleading. Well, I would say personal taste, if the loop below constructs the string with memcpy the loop above can do the same. If you prefer the strcpy in the first loop, I have no strong preference here. Thanks Bernd. > thanks, > > greg k-h >
Re: [PATCHv2] Fix range checks in kernfs_get_target_path
On 07/07/18 16:01, Greg Kroah-Hartman wrote: > On Sat, Jul 07, 2018 at 09:41:03AM +0000, Bernd Edlinger wrote: >> The strncpy causes a warning [-Wstringop-truncation] here, >> which indicates that it never appends a NUL byte to the path. >> The NUL byte is only there because the buffer is allocated >> with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check >> is also off-by-one, and PAGE_SIZE==PATH_MAX the returned string >> will not be zero-terminated if it is exactly PATH_MAX characters. >> Furthermore also the initial loop may theoretically exceed PATH_MAX >> and cause a fault. >> >> Signed-off-by: Bernd Edlinger >> --- >>fs/kernfs/symlink.c | 10 +++--- >>1 file changed, 7 insertions(+), 3 deletions(-) >> >> diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c >> index 08ccabd..c8b7d44a 100644 >> --- a/fs/kernfs/symlink.c >> +++ b/fs/kernfs/symlink.c >> @@ -63,7 +63,10 @@ static int kernfs_get_target_path(struct kernfs_node >> if (base == kn) >> break; >> >> -strcpy(s, "../"); >> +if ((s - path) + 3 >= PATH_MAX) >> +return -ENAMETOOLONG; >> + >> +memcpy(s, "../", 3); >> s += 3; >> base = base->parent; >> } >> @@ -79,16 +82,17 @@ static int kernfs_get_target_path(struct kernfs_node >> if (len < 2) >> return -EINVAL; >> len--; >> -if ((s - path) + len > PATH_MAX) >> +if ((s - path) + len >= PATH_MAX) >> return -ENAMETOOLONG; >> >> /* reverse fillup of target string from target to base */ >> kn = target; >> +s[len] = '\0'; >> while (kn->parent && kn != base) { >> int slen = strlen(kn->name); >> >> len -= slen; >> -strncpy(s + len, kn->name, slen); >> +memcpy(s + len, kn->name, slen); >> if (len) >> s[--len] = '/'; >> > > This last memcpy replacement has already been applied to my tree, from a > patch from soeone else, so are you sure all of the other changes are > also really needed? Why the extra \0 termination of a string that is > already terminated? > I did only a code review, but the range checks look really dangerously wrong. The string is only zero-terminated because it is allocated in kernfs_iop_get_link with body = kzalloc(PAGE_SIZE, GFP_KERNEL); I would recommend to explicitly place a termination in the buffer, and not rely on the way how the buffer is allocated. > And why is the first memcpy replacement needed? gcc doesn't say > anything about that, does it? > No, that is more or less for efficiency reasons, since it is writing the NUL which is always overwritten with something different in the next step. If the loop is executed zero times, there result is not explicitly zero-terminated either, so using strcpy is somehow misleading. Well, I would say personal taste, if the loop below constructs the string with memcpy the loop above can do the same. If you prefer the strcpy in the first loop, I have no strong preference here. Thanks Bernd. > thanks, > > greg k-h >
[PATCHv2] Fix range checks in kernfs_get_target_path
The strncpy causes a warning [-Wstringop-truncation] here, which indicates that it never appends a NUL byte to the path. The NUL byte is only there because the buffer is allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check is also off-by-one, and PAGE_SIZE==PATH_MAX the returned string will not be zero-terminated if it is exactly PATH_MAX characters. Furthermore also the initial loop may theoretically exceed PATH_MAX and cause a fault. Signed-off-by: Bernd Edlinger --- fs/kernfs/symlink.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c index 08ccabd..c8b7d44a 100644 --- a/fs/kernfs/symlink.c +++ b/fs/kernfs/symlink.c @@ -63,7 +63,10 @@ static int kernfs_get_target_path(struct kernfs_node if (base == kn) break; - strcpy(s, "../"); + if ((s - path) + 3 >= PATH_MAX) + return -ENAMETOOLONG; + + memcpy(s, "../", 3); s += 3; base = base->parent; } @@ -79,16 +82,17 @@ static int kernfs_get_target_path(struct kernfs_node if (len < 2) return -EINVAL; len--; - if ((s - path) + len > PATH_MAX) + if ((s - path) + len >= PATH_MAX) return -ENAMETOOLONG; /* reverse fillup of target string from target to base */ kn = target; + s[len] = '\0'; while (kn->parent && kn != base) { int slen = strlen(kn->name); len -= slen; - strncpy(s + len, kn->name, slen); + memcpy(s + len, kn->name, slen); if (len) s[--len] = '/'; -- 1.9.1
[PATCHv2] Fix range checks in kernfs_get_target_path
The strncpy causes a warning [-Wstringop-truncation] here, which indicates that it never appends a NUL byte to the path. The NUL byte is only there because the buffer is allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check is also off-by-one, and PAGE_SIZE==PATH_MAX the returned string will not be zero-terminated if it is exactly PATH_MAX characters. Furthermore also the initial loop may theoretically exceed PATH_MAX and cause a fault. Signed-off-by: Bernd Edlinger --- fs/kernfs/symlink.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c index 08ccabd..c8b7d44a 100644 --- a/fs/kernfs/symlink.c +++ b/fs/kernfs/symlink.c @@ -63,7 +63,10 @@ static int kernfs_get_target_path(struct kernfs_node if (base == kn) break; - strcpy(s, "../"); + if ((s - path) + 3 >= PATH_MAX) + return -ENAMETOOLONG; + + memcpy(s, "../", 3); s += 3; base = base->parent; } @@ -79,16 +82,17 @@ static int kernfs_get_target_path(struct kernfs_node if (len < 2) return -EINVAL; len--; - if ((s - path) + len > PATH_MAX) + if ((s - path) + len >= PATH_MAX) return -ENAMETOOLONG; /* reverse fillup of target string from target to base */ kn = target; + s[len] = '\0'; while (kn->parent && kn != base) { int slen = strlen(kn->name); len -= slen; - strncpy(s + len, kn->name, slen); + memcpy(s + len, kn->name, slen); if (len) s[--len] = '/'; -- 1.9.1
[PATCH] Fix range checks in kernfs_get_target_path
The strncpy causes a warning [-Wstringop-truncation] here, which indicates that it never appends a NUL byte to the path. The NUL byte is only there because the buffer is allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check is also off-by-one, and PAGE_SIZE==PATH_MAX the returned string will not be zero-terminated if it is exactly PATH_MAX characters. Furthermore also the initial loop may theoretically exceed PATH_MAX and cause a fault. Signed-off-by: Bernd Edlinger --- fs/kernfs/symlink.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c index 08ccabd..c8b7d44a 100644 --- a/fs/kernfs/symlink.c +++ b/fs/kernfs/symlink.c @@ -63,7 +63,10 @@ static int kernfs_get_target_path(struct kernfs_node *parent, if (base == kn) break; - strcpy(s, "../"); + if ((s - path) + 3 >= PATH_MAX) + return -ENAMETOOLONG; + + memcpy(s, "../", 3); s += 3; base = base->parent; } @@ -79,16 +82,17 @@ static int kernfs_get_target_path(struct kernfs_node *parent, if (len < 2) return -EINVAL; len--; - if ((s - path) + len > PATH_MAX) + if ((s - path) + len >= PATH_MAX) return -ENAMETOOLONG; /* reverse fillup of target string from target to base */ kn = target; + s[len] = '\0'; while (kn->parent && kn != base) { int slen = strlen(kn->name); len -= slen; - strncpy(s + len, kn->name, slen); + memcpy(s + len, kn->name, slen); if (len) s[--len] = '/'; -- 1.9.1
[PATCH] Fix range checks in kernfs_get_target_path
The strncpy causes a warning [-Wstringop-truncation] here, which indicates that it never appends a NUL byte to the path. The NUL byte is only there because the buffer is allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check is also off-by-one, and PAGE_SIZE==PATH_MAX the returned string will not be zero-terminated if it is exactly PATH_MAX characters. Furthermore also the initial loop may theoretically exceed PATH_MAX and cause a fault. Signed-off-by: Bernd Edlinger --- fs/kernfs/symlink.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c index 08ccabd..c8b7d44a 100644 --- a/fs/kernfs/symlink.c +++ b/fs/kernfs/symlink.c @@ -63,7 +63,10 @@ static int kernfs_get_target_path(struct kernfs_node *parent, if (base == kn) break; - strcpy(s, "../"); + if ((s - path) + 3 >= PATH_MAX) + return -ENAMETOOLONG; + + memcpy(s, "../", 3); s += 3; base = base->parent; } @@ -79,16 +82,17 @@ static int kernfs_get_target_path(struct kernfs_node *parent, if (len < 2) return -EINVAL; len--; - if ((s - path) + len > PATH_MAX) + if ((s - path) + len >= PATH_MAX) return -ENAMETOOLONG; /* reverse fillup of target string from target to base */ kn = target; + s[len] = '\0'; while (kn->parent && kn != base) { int slen = strlen(kn->name); len -= slen; - strncpy(s + len, kn->name, slen); + memcpy(s + len, kn->name, slen); if (len) s[--len] = '/'; -- 1.9.1
[PATCH] Add a configure option ARM_ERRATA_WFE_BROKEN
This works around problems with Altera Cyclone5 Devices. See https://bugzilla.kernel.org/show_bug.cgi?id=111951 I just wanted to let you know that we use this patch since more than two years now and the problem never repeated for us. There is one possibly related observation with CONFIG_PL310_ERRATA_769419 which was always enabled, although it should be unnecessary according to the comment in arch/arm/mm/cache-l2x0.c: * 769419: PL310 R0P0->R3P1, fixed R3P2. * Affects: store buffer * store buffer is not automatically drained. This erratum should not be needed as we use R3P3, but when this was disabled a kind of dead-lock happened after some time. It might be connected to this problem since ERRATA_769419 injects a wmb in the arch_cpu_idle_enter which is one of the last actions before the wfi instruction which happens in the idle function. So the current working theory is the cache might freeze before all data is written out. Signed-off-by: Bernd Edlinger --- arch/arm/Kconfig| 7 +++ arch/arm/include/asm/spinlock.h | 7 +++ 2 files changed, 14 insertions(+) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index a7f8e7f..f96af3b 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1215,6 +1215,13 @@ config ARM_ERRATA_852423 config option from the A12 erratum due to the way errata are checked for and handled. +config ARM_ERRATA_WFE_BROKEN + bool "ARM errata: wfe broken" + depends on CPU_V7 && SMP + help + This option disables the wfe instruction for raw_spin_locks. + At least some CycloneV SoC devices wait much too long in wfe. + endmenu source "arch/arm/common/Kconfig" diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h index 099c78f..7efc431 100644 --- a/arch/arm/include/asm/spinlock.h +++ b/arch/arm/include/asm/spinlock.h @@ -10,6 +10,12 @@ #include #include +#ifdef CONFIG_ARM_ERRATA_WFE_BROKEN +#undef wfe +#define wfe() +#define WFE(x) +#define dsb_sev() +#else /* * sev and wfe are ARMv6K extensions. Uniprocessor ARMv6 may not have the K * extensions, so when running on UP, we have to patch these instructions away. @@ -44,6 +50,7 @@ static inline void dsb_sev(void) dsb(ishst); __asm__(SEV); } +#endif /* * ARMv6 ticket-based spin-locking. -- 2.7.4
[PATCH] Add a configure option ARM_ERRATA_WFE_BROKEN
This works around problems with Altera Cyclone5 Devices. See https://bugzilla.kernel.org/show_bug.cgi?id=111951 I just wanted to let you know that we use this patch since more than two years now and the problem never repeated for us. There is one possibly related observation with CONFIG_PL310_ERRATA_769419 which was always enabled, although it should be unnecessary according to the comment in arch/arm/mm/cache-l2x0.c: * 769419: PL310 R0P0->R3P1, fixed R3P2. * Affects: store buffer * store buffer is not automatically drained. This erratum should not be needed as we use R3P3, but when this was disabled a kind of dead-lock happened after some time. It might be connected to this problem since ERRATA_769419 injects a wmb in the arch_cpu_idle_enter which is one of the last actions before the wfi instruction which happens in the idle function. So the current working theory is the cache might freeze before all data is written out. Signed-off-by: Bernd Edlinger --- arch/arm/Kconfig| 7 +++ arch/arm/include/asm/spinlock.h | 7 +++ 2 files changed, 14 insertions(+) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index a7f8e7f..f96af3b 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1215,6 +1215,13 @@ config ARM_ERRATA_852423 config option from the A12 erratum due to the way errata are checked for and handled. +config ARM_ERRATA_WFE_BROKEN + bool "ARM errata: wfe broken" + depends on CPU_V7 && SMP + help + This option disables the wfe instruction for raw_spin_locks. + At least some CycloneV SoC devices wait much too long in wfe. + endmenu source "arch/arm/common/Kconfig" diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h index 099c78f..7efc431 100644 --- a/arch/arm/include/asm/spinlock.h +++ b/arch/arm/include/asm/spinlock.h @@ -10,6 +10,12 @@ #include #include +#ifdef CONFIG_ARM_ERRATA_WFE_BROKEN +#undef wfe +#define wfe() +#define WFE(x) +#define dsb_sev() +#else /* * sev and wfe are ARMv6K extensions. Uniprocessor ARMv6 may not have the K * extensions, so when running on UP, we have to patch these instructions away. @@ -44,6 +50,7 @@ static inline void dsb_sev(void) dsb(ishst); __asm__(SEV); } +#endif /* * ARMv6 ticket-based spin-locking. -- 2.7.4
[PATCH] stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
This is the possible reason for different hard to reproduce problems on my ARMv7-SMP test system. The symptoms are in recent kernels imprecise external aborts, and in older kernels various kinds of network stalls and unexpected page allocation failures. My testing indicates that the trouble started between v4.5 and v4.6 and prevails up to v4.14. Using the dirty_tx before acquiring the spin lock is clearly wrong and was first introduced with v4.6. Fixes: e3ad57c96715 ("stmmac: review RX/TX ring management") Signed-off-by: Bernd Edlinger <bernd.edlin...@hotmail.de> --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 1763e48..c8e280fa 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1800,12 +1800,13 @@ static void stmmac_tx_clean(struct stmmac_priv *priv, u32 queue) { struct stmmac_tx_queue *tx_q = >tx_queue[queue]; unsigned int bytes_compl = 0, pkts_compl = 0; - unsigned int entry = tx_q->dirty_tx; + unsigned int entry; netif_tx_lock(priv->dev); priv->xstats.tx_clean++; + entry = tx_q->dirty_tx; while (entry != tx_q->cur_tx) { struct sk_buff *skb = tx_q->tx_skbuff[entry]; struct dma_desc *p; -- 1.9.1
[PATCH] stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
This is the possible reason for different hard to reproduce problems on my ARMv7-SMP test system. The symptoms are in recent kernels imprecise external aborts, and in older kernels various kinds of network stalls and unexpected page allocation failures. My testing indicates that the trouble started between v4.5 and v4.6 and prevails up to v4.14. Using the dirty_tx before acquiring the spin lock is clearly wrong and was first introduced with v4.6. Fixes: e3ad57c96715 ("stmmac: review RX/TX ring management") Signed-off-by: Bernd Edlinger --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 1763e48..c8e280fa 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1800,12 +1800,13 @@ static void stmmac_tx_clean(struct stmmac_priv *priv, u32 queue) { struct stmmac_tx_queue *tx_q = >tx_queue[queue]; unsigned int bytes_compl = 0, pkts_compl = 0; - unsigned int entry = tx_q->dirty_tx; + unsigned int entry; netif_tx_lock(priv->dev); priv->xstats.tx_clean++; + entry = tx_q->dirty_tx; while (entry != tx_q->cur_tx) { struct sk_buff *skb = tx_q->tx_skbuff[entry]; struct dma_desc *p; -- 1.9.1