Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-07-21 Thread Sergei Golubchik
Hi, Aleksey!

On Jun 10, Aleksey Midenkov wrote:
> On Thu, Jun 3, 2021 at 1:58 PM Sergei Golubchik  wrote:
> > Hi, Aleksey!
> >
> > On Jan 26, Aleksey Midenkov wrote:
> > > On Tue, Dec 22, 2020 at 7:58 PM Sergei Golubchik  wrote:
> > > > On Nov 03, Aleksey Midenkov wrote:
> > > >
> > > > And please, pretty please, don't use non-constant references as
> > > > function arguments.
> > >
> > > Why? With pointers we can pass NULL and we don't need NULL there, do we?
> >
> > Because this is the policy. If you convince Monty that non-constant
> > references are fine - feel free to use them.
> 
> If this is the policy, why there is
> 
> void String::swap(String )
> 
> since 2004?

because, of course, we've never had a consistent written policy that
everyone would've followed. And because Monty tends to revert new
commits, not 17yr old ones.

> Are you asking me to pass the pointer and then dereference it and pass
> it as a reference into String class? That would look weird to me. I'm
> going to agree if you insist, but really?

Oh, right. Your swap() is just calling String::swap(), so it copies its
prototype. Makes sense. Okay, let's keep it like String::swap()

Still, your

String * cached(bool _read_value)

isn't "inheriting" the prototype from anything, better use a pointer
there.

> > > It is tested by federated.federated_partition test case.
> >
> > Do you mean, that if I remove this line,
> > federated.federated_partition test will fail? If yes - good, this is
> > the test I meant, no need to create another one.
> 
> Yes, the test will fail.

Great, so no new test is needed.

> > > Why do we have to extract this side-fix to a separate commit?
> >
> > Because later when someone looks at the code it helps to be able to
> > understand what a commit is actually fixing.
> 
> As long as both fixes are important and there is no sense to apply one
> without the other I see separate commits as a measure to satisfy
> someone's curiosity which I do not share (and there is nothing to be
> curious about that one-liner). Though I'm going to agree if you ask me
> to do that.

Generally, yes, it's *hugely* helpful to look at the line and see why it
was added.

For example, this commit:
https://github.com/MariaDB/server/commit/aba91154264

Here it's very clear what's happening, one doesn't need even to read the
commit comment or the test.

As a counter example, MySQL commit (they have a policy for doing that,
as far as I understand, our tree doesn't have such monsters):
https://github.com/mysql/mysql-server/commit/67c3c70e489

 447 files changed, 28239 insertions(+), 21952 deletions(-)

many are tests, but in sql alone:

 126 files changed, 8975 insertions(+), 6318 deletions(-)

But in this particular case of your commit. If both changes, indeed,
make no sense without each other, then they could be in the same commit.

> > > > > > > @@ -6310,6 +6349,43 @@ int 
> > > > > > > ha_partition::handle_ordered_index_scan_key_not_found()
> > > > > > > +  if (cached)
> > > > > > > +  {
> > > > > > > +cached->swap(s.blob);
> > > > > > > +s.set_read_value= set_read_value;
> > > > > >
> > > > > > is it indeed possible here for a value to be either in `value` or in
> > > > > > `read_value` ?
> > > > > >
> > > > > > When happens what?
> > > > >
> > > > > Yes, it uses `value` for plain fields and `read_value` for virtual 
> > > > > fields.
> > > >
> > > > Oh, I see.
> > > >
> > > > This read_value was introduced in ea1b25046c81db8fdf to solve
> > > > the case of UPDATE, where a row is moved with
> > > > store_record(table, record[1]) and blob pointers aren't updated.
> > > > This is very similar to what you're fixing, perhaps read_value
> > > > should go away and both bugs should have a one unified solution
> > > > that allows to "park" a record somewhere out of record[0] and
> > > > restore later, all with proper blob handling?
> > >
> > > Do you think this is necessary now? I believe there are a lot of
> > > *much* more important tasks.
> >
> > Uhm, yes? This would be a notable code simplification, and it won't
> > take long, a low-hanging fruit.
> 
> I'm not sure if this is a "fruit". Can you please elaborate the
> change? Use so-called Ordered_blob_storage instead of read_value and
> do swap_blobs() on store_record()? And have 2 arrays of
> Ordered_blob_storage for each TABLE::record like that:

May be it isn't a as low-hanging as it looked, indeed. Let's get your
fix first, then I'll check again. Just don't like to different
approaches to essentially the same problem

Regards,
Sergei
VP of MariaDB Server Engineering
and secur...@mariadb.org

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-06-10 Thread Aleksey Midenkov
Hi Sergei!

On Thu, Jun 3, 2021 at 1:58 PM Sergei Golubchik  wrote:
>
> Hi, Aleksey!
>
> On Jan 26, Aleksey Midenkov wrote:
> > On Tue, Dec 22, 2020 at 7:58 PM Sergei Golubchik  wrote:
> > > On Nov 03, Aleksey Midenkov wrote:
> > >
> > > And please, pretty please, don't use non-constant references as
> > > function arguments.
> >
> > Why? With pointers we can pass NULL and we don't need NULL there, do we?
>
> Because this is the policy. If you convince Monty that non-constant
> references are fine - feel free to use them.

If this is the policy, why there is

void String::swap(String )

since 2004? Are you asking me to pass the pointer and then dereference
it and pass it as a reference into String class? That would look weird
to me. I'm going to agree if you insist, but really?

>
> > > > On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik  
> > > > wrote:
> > > > >
> > > > > > diff --git a/sql/field.cc b/sql/field.cc
> > > > > > index bdaaecc20269..0fd40c979d2c 100644
> > > > > > --- a/sql/field.cc
> > > > > > +++ b/sql/field.cc
> > > > > > @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint 
> > > > > > length,CHARSET_INFO *cs)
> > > > > >copy_length= copier.well_formed_copy(field_charset,
> > > > > > (char*) value.ptr(), 
> > > > > > new_length,
> > > > > > cs, from, length);
> > > > > > +  value.length(copy_length);
> > > > >
> > > > > good! Could this be a distinct bug with its own test case?
> > > >
> > > > Do we need to spend time on that bureaucracy? There are real problems to
> > > > solve... Tickets drain time from the management people and generally 
> > > > from
> > > > anyone who sees them.
> > >
> > > I didn't mean opening an MDEV. Only writing a test case and extracting
> > > it into a separate commit.
> > >
> > > If this is a genuine bug fix and you don't create a test case - how can
> > > you be sure the fix will stay? It may disappear in a merge or a rebase,
> > > manually or automatically. Or in a refactoring that "looks fine, all
> > > tests pass". If you can create a test case - please, do.
> >
> > It is tested by federated.federated_partition test case.
>
> Do you mean, that if I remove this line, federated.federated_partition
> test will fail? If yes - good, this is the test I meant, no need to
> create another one.

Yes, the test will fail.

>
> > Why do we have to extract this side-fix to a separate commit?
>
> Because later when someone looks at the code it helps to be able to
> understand what a commit is actually fixing.

As long as both fixes are important and there is no sense to apply one
without the other I see separate commits as a measure to satisfy
someone's curiosity which I do not share (and there is nothing to be
curious about that one-liner). Though I'm going to agree if you ask me
to do that.

>
> > > > > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > > > > --- a/sql/ha_partition.cc
> > > > > > +++ b/sql/ha_partition.cc
> > > > > > @@ -5125,14 +5128,14 @@ bool 
> > > > > > ha_partition::init_record_priority_queue()
> > > > > >uint alloc_len;
> > > > > >uint used_parts= bitmap_bits_set(_part_info->read_partitions);
> > > > > >/* Allocate record buffer for each used partition. */
> > > > > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > > > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > > > > >if (!m_using_extended_keys)
> > > > > >m_priority_queue_rec_len += m_file[0]->ref_length;
> > > > > >alloc_len= used_parts * m_priority_queue_rec_len;
> > > > > >/* Allocate a key for temporary use when setting up the scan. */
> > > > > >alloc_len+= table_share->max_key_length;
> > > > > >
> > > > > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len, 
> > > > > > MYF(MY_WME
> > > > > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(_ordered_root, 
> > > > > > alloc_len)))
> > > > >
> > > > > I don't see why you need a memroot here. memroot is needed when
> > > > > you allocate later at difeerent points in time some initially
> > > > > unpredictable amount of memory that has the same lifetime and
> > > > > needs to be freed all at once.
> > > >
> > One should keep in mind this is an artificial usage of the prio queue.
> > In diversified real-world usage the share of such kind SELECT is say
> > 0.1%. But even if it is 1% we divide our stats by 100. So, do you
> > really care about 0.0015%?
>
> A right tool for a job. If you need to allocate few buffers at the same
> time - use multi_alloc, if you do many allocations over time and want to
> fee them at once - use memroot.

Changed to my_multi_alloc().

>
> > > > > > @@ -6178,7 +6203,11 @@ int 
> > > > > > ha_partition::handle_ordered_index_scan(uchar *buf, bool 
> > > > > > reverse_order)
> > > > > >*/
> > > > > >error= file->read_range_first(m_start_key.key? _start_key: 
> > > > > > NULL,
> > > > > >

Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-06-04 Thread Sergei Golubchik
Hi, Aleksey!

On Jun 03, Aleksey Midenkov wrote:
> Hi Sergei!
> 
> Do you plan to fix everything you just said and push yourself or let
> me push it as is?

Neither. As a reviewer I plan to reply "ok to push" when the code is
changed according to the review.

> Btw, there were many allocations over one loop: that's why I used
> mem_root. multi_malloc() won't do the job there. That could be done
> either with explicit moving pointer or some utility class holding such
> pointer. I decided to upgrade mem_root for such pattern so it could do
> the double job and we had a cleaner code.

Yes, I mean

  my_multi_malloc(MYF(MY_WME),
_ordered_rec_buffer, alloc_len,
_storage, table->s->blob_fields * sizeof(Ordered_blob_storage *),
, table->s->blob_fields * sizeof(Ordered_blob_storage), NULL);

  for (uint j= 0; j < table->s->blob_fields; ++j, ++ptr)
blob_storage[j]= new (ptr) Ordered_blob_storage;

Regards,
Sergei
VP of MariaDB Server Engineering
and secur...@mariadb.org

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-06-03 Thread Aleksey Midenkov
Hi Sergei!

Do you plan to fix everything you just said and push yourself or let
me push it as is? Btw, there were many allocations over one loop:
that's why I used mem_root. multi_malloc() won't do the job there.
That could be done either with explicit moving pointer or some utility
class holding such pointer. I decided to upgrade mem_root for such
pattern so it could do the double job and we had a cleaner code.

On Thu, Jun 3, 2021 at 2:00 PM Sergei Golubchik  wrote:
>
> Hi, Aleksey!
>
> I think it's better to wrap it up quickly, agree on changes and push the
> fix before the next release.
>
> On Jun 01, Aleksey Midenkov wrote:
> > Hi Sergei!
> >
> > Users could benefit from this fix of MDEV-18734 since the last
> > changes. Do you really think the open questions left are so important
> > for this bug to be still not pushed?
> >
> Regards,
> Sergei
> VP of MariaDB Server Engineering
> and secur...@mariadb.org



-- 
All the best,

Aleksey Midenkov
@midenok

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-06-03 Thread Sergei Golubchik
Hi, Aleksey!

I think it's better to wrap it up quickly, agree on changes and push the
fix before the next release.

On Jun 01, Aleksey Midenkov wrote:
> Hi Sergei!
> 
> Users could benefit from this fix of MDEV-18734 since the last
> changes. Do you really think the open questions left are so important
> for this bug to be still not pushed?
> 
Regards,
Sergei
VP of MariaDB Server Engineering
and secur...@mariadb.org

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-06-03 Thread Sergei Golubchik
Hi, Aleksey!

On Jan 26, Aleksey Midenkov wrote:
> On Tue, Dec 22, 2020 at 7:58 PM Sergei Golubchik  wrote:
> > On Nov 03, Aleksey Midenkov wrote:
> >
> > And please, pretty please, don't use non-constant references as
> > function arguments.
> 
> Why? With pointers we can pass NULL and we don't need NULL there, do we?

Because this is the policy. If you convince Monty that non-constant
references are fine - feel free to use them.

> > > On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik  wrote:
> > > >
> > > > > diff --git a/sql/field.cc b/sql/field.cc
> > > > > index bdaaecc20269..0fd40c979d2c 100644
> > > > > --- a/sql/field.cc
> > > > > +++ b/sql/field.cc
> > > > > @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint 
> > > > > length,CHARSET_INFO *cs)
> > > > >copy_length= copier.well_formed_copy(field_charset,
> > > > > (char*) value.ptr(), 
> > > > > new_length,
> > > > > cs, from, length);
> > > > > +  value.length(copy_length);
> > > >
> > > > good! Could this be a distinct bug with its own test case?
> > >
> > > Do we need to spend time on that bureaucracy? There are real problems to
> > > solve... Tickets drain time from the management people and generally from
> > > anyone who sees them.
> >
> > I didn't mean opening an MDEV. Only writing a test case and extracting
> > it into a separate commit.
> >
> > If this is a genuine bug fix and you don't create a test case - how can
> > you be sure the fix will stay? It may disappear in a merge or a rebase,
> > manually or automatically. Or in a refactoring that "looks fine, all
> > tests pass". If you can create a test case - please, do.
> 
> It is tested by federated.federated_partition test case.

Do you mean, that if I remove this line, federated.federated_partition
test will fail? If yes - good, this is the test I meant, no need to
create another one.

> Why do we have to extract this side-fix to a separate commit?

Because later when someone looks at the code it helps to be able to
understand what a commit is actually fixing.

> > > > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > > > --- a/sql/ha_partition.cc
> > > > > +++ b/sql/ha_partition.cc
> > > > > @@ -5125,14 +5128,14 @@ bool 
> > > > > ha_partition::init_record_priority_queue()
> > > > >uint alloc_len;
> > > > >uint used_parts= bitmap_bits_set(_part_info->read_partitions);
> > > > >/* Allocate record buffer for each used partition. */
> > > > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > > > >if (!m_using_extended_keys)
> > > > >m_priority_queue_rec_len += m_file[0]->ref_length;
> > > > >alloc_len= used_parts * m_priority_queue_rec_len;
> > > > >/* Allocate a key for temporary use when setting up the scan. */
> > > > >alloc_len+= table_share->max_key_length;
> > > > >
> > > > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len, 
> > > > > MYF(MY_WME
> > > > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(_ordered_root, 
> > > > > alloc_len)))
> > > >
> > > > I don't see why you need a memroot here. memroot is needed when
> > > > you allocate later at difeerent points in time some initially
> > > > unpredictable amount of memory that has the same lifetime and
> > > > needs to be freed all at once.
> > >
> One should keep in mind this is an artificial usage of the prio queue.
> In diversified real-world usage the share of such kind SELECT is say
> 0.1%. But even if it is 1% we divide our stats by 100. So, do you
> really care about 0.0015%?

A right tool for a job. If you need to allocate few buffers at the same
time - use multi_alloc, if you do many allocations over time and want to
fee them at once - use memroot.

> > > > > @@ -6178,7 +6203,11 @@ int 
> > > > > ha_partition::handle_ordered_index_scan(uchar *buf, bool 
> > > > > reverse_order)
> > > > >*/
> > > > >error= file->read_range_first(m_start_key.key? _start_key: 
> > > > > NULL,
> > > > >  end_range, eq_range, TRUE);
> > > > > -  memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > > > +  if (!error)
> > > > > +  {
> > > > > +memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > > > +  }
> > > >
> > > > did you have a bug because of that? something didn't work?
> > >
> > > No, I just saw worthless memcpy and avoided it.
> >
> > You're right that memcopy is unnecessary in case of an error.
> > But most of the time there is no error. You made an exceptional code path
> > faster by making a common code path slower. I think it's generally
> > better to do the opposite, making the common, no-error code path
> > faster, fewer conditionals, in particular. At the cost of making the error
> > path slower, if needed.
> 
> I'm usually aware of such things in my decisions. The slowdown is
> 

Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-06-01 Thread Aleksey Midenkov
Hi Sergei!

Users could benefit from this fix of MDEV-18734 since the last
changes. Do you really think the open questions left are so important
for this bug to be still not pushed?

On Tue, Jan 26, 2021 at 1:51 AM Aleksey Midenkov  wrote:
>
> Hi Sergei,
>
> On Tue, Dec 22, 2020 at 7:58 PM Sergei Golubchik  wrote:
> >
> > Hi, Aleksey!
> >
> > On Nov 03, Aleksey Midenkov wrote:
> > >
> > > the updated patches are here:
> > >
> > > https://github.com/MariaDB/server/commits/5f6bcd6cd0145513974b0dfc5b2cefba28b72f1a
> >
> > I started writing my comments there, but realized that it'd be clearer
> > if I reply here, more context. That is, see below.
> >
> > One comment about the new test case, you have 16K lines in the result
> > file, are all 16K necessary? Can you do something like
> >
> >select x, left(b,10), left(v,10)
> >
> > ?
>
> Yes, these shorter result cases reproduce as well. Updated.
>
> >
> > And please, pretty please, don't use non-constant references as function
> > arguments.
>
> Why? With pointers we can pass NULL and we don't need NULL there, do we?
>
> >
> > > On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik  wrote:
> > > >
> > > > > diff --git a/sql/field.cc b/sql/field.cc
> > > > > index bdaaecc20269..0fd40c979d2c 100644
> > > > > --- a/sql/field.cc
> > > > > +++ b/sql/field.cc
> > > > > @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint 
> > > > > length,CHARSET_INFO *cs)
> > > > >copy_length= copier.well_formed_copy(field_charset,
> > > > > (char*) value.ptr(), 
> > > > > new_length,
> > > > > cs, from, length);
> > > > > +  value.length(copy_length);
> > > >
> > > > good! Could this be a distinct bug with its own test case?
> > >
> > > Do we need to spend time on that bureaucracy? There are real problems to
> > > solve... Tickets drain time from the management people and generally from
> > > anyone who sees them.
> >
> > I didn't mean opening an MDEV. Only writing a test case and extracting
> > it into a separate commit.
> >
> > If this is a genuine bug fix and you don't create a test case - how can
> > you be sure the fix will stay? It may disappear in a merge or a rebase,
> > manually or automatically. Or in a refactoring that "looks fine, all
> > tests pass". If you can create a test case - please, do.
>
> It is tested by federated.federated_partition test case. Why do we
> have to extract this side-fix to a separate commit? Looks anyway as a
> bureaucracy to me. The main bug is critical, the test case tests both
> bugs nicely. If the partitioning code is refactored, the test case
> still tests that side-fix.
>
> >
> > > > >Field_blob::store_length(copy_length);
> > > > >bmove(ptr+packlength,(uchar*) ,sizeof(char*));
> > > > >
> > > > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > > > index 9d82df0235c7..7c658082d397 100644
> > > > > --- a/sql/ha_partition.cc
> > > > > +++ b/sql/ha_partition.cc
> > > > > @@ -5125,14 +5128,14 @@ bool 
> > > > > ha_partition::init_record_priority_queue()
> > > > >uint alloc_len;
> > > > >uint used_parts= bitmap_bits_set(_part_info->read_partitions);
> > > > >/* Allocate record buffer for each used partition. */
> > > > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > > > >if (!m_using_extended_keys)
> > > > >m_priority_queue_rec_len += m_file[0]->ref_length;
> > > > >alloc_len= used_parts * m_priority_queue_rec_len;
> > > > >/* Allocate a key for temporary use when setting up the scan. */
> > > > >alloc_len+= table_share->max_key_length;
> > > > >
> > > > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len, 
> > > > > MYF(MY_WME
> > > > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(_ordered_root, 
> > > > > alloc_len)))
> > > >
> > > > I don't see why you need a memroot here. memroot is needed when you
> > > > allocate later at difeerent points in time some initially unpredictable
> > > > amount of memory that has the same lifetime and needs to be
> > > > freed all at once.
> > >
> > > I know... This is just another convenient way of allocation with 
> > > presumably
> > > minimized number of mallocs (probably pre_alloc_size should be tweaked to 
> > > a
> > > better value).
> >
> > memroot wasn't created for that. There are approaches with much less
> > overhead
>
> Using memory pools like that is a nice design pattern for non-critical
> sections. At least it is nicer than manual pointer management. I
> believe I've seen the similar pattern in InnoDB with its memory pool
> mem_heap_t even without preallocation! It would be nice to have a
> special class that preallocates and manages pointers, but I do not see
> the reason to develop that for this specific use case.
>
> I don't believe in *much* overhead: mem_root is a simplest algorithm,
> a couple of ifs and one loop. 

Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2021-01-25 Thread Aleksey Midenkov
Hi Sergei,

On Tue, Dec 22, 2020 at 7:58 PM Sergei Golubchik  wrote:
>
> Hi, Aleksey!
>
> On Nov 03, Aleksey Midenkov wrote:
> >
> > the updated patches are here:
> >
> > https://github.com/MariaDB/server/commits/5f6bcd6cd0145513974b0dfc5b2cefba28b72f1a
>
> I started writing my comments there, but realized that it'd be clearer
> if I reply here, more context. That is, see below.
>
> One comment about the new test case, you have 16K lines in the result
> file, are all 16K necessary? Can you do something like
>
>select x, left(b,10), left(v,10)
>
> ?

Yes, these shorter result cases reproduce as well. Updated.

>
> And please, pretty please, don't use non-constant references as function
> arguments.

Why? With pointers we can pass NULL and we don't need NULL there, do we?

>
> > On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik  wrote:
> > >
> > > > diff --git a/sql/field.cc b/sql/field.cc
> > > > index bdaaecc20269..0fd40c979d2c 100644
> > > > --- a/sql/field.cc
> > > > +++ b/sql/field.cc
> > > > @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint 
> > > > length,CHARSET_INFO *cs)
> > > >copy_length= copier.well_formed_copy(field_charset,
> > > > (char*) value.ptr(), new_length,
> > > > cs, from, length);
> > > > +  value.length(copy_length);
> > >
> > > good! Could this be a distinct bug with its own test case?
> >
> > Do we need to spend time on that bureaucracy? There are real problems to
> > solve... Tickets drain time from the management people and generally from
> > anyone who sees them.
>
> I didn't mean opening an MDEV. Only writing a test case and extracting
> it into a separate commit.
>
> If this is a genuine bug fix and you don't create a test case - how can
> you be sure the fix will stay? It may disappear in a merge or a rebase,
> manually or automatically. Or in a refactoring that "looks fine, all
> tests pass". If you can create a test case - please, do.

It is tested by federated.federated_partition test case. Why do we
have to extract this side-fix to a separate commit? Looks anyway as a
bureaucracy to me. The main bug is critical, the test case tests both
bugs nicely. If the partitioning code is refactored, the test case
still tests that side-fix.

>
> > > >Field_blob::store_length(copy_length);
> > > >bmove(ptr+packlength,(uchar*) ,sizeof(char*));
> > > >
> > > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > > index 9d82df0235c7..7c658082d397 100644
> > > > --- a/sql/ha_partition.cc
> > > > +++ b/sql/ha_partition.cc
> > > > @@ -5125,14 +5128,14 @@ bool ha_partition::init_record_priority_queue()
> > > >uint alloc_len;
> > > >uint used_parts= bitmap_bits_set(_part_info->read_partitions);
> > > >/* Allocate record buffer for each used partition. */
> > > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > > >if (!m_using_extended_keys)
> > > >m_priority_queue_rec_len += m_file[0]->ref_length;
> > > >alloc_len= used_parts * m_priority_queue_rec_len;
> > > >/* Allocate a key for temporary use when setting up the scan. */
> > > >alloc_len+= table_share->max_key_length;
> > > >
> > > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len, 
> > > > MYF(MY_WME
> > > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(_ordered_root, 
> > > > alloc_len)))
> > >
> > > I don't see why you need a memroot here. memroot is needed when you
> > > allocate later at difeerent points in time some initially unpredictable
> > > amount of memory that has the same lifetime and needs to be
> > > freed all at once.
> >
> > I know... This is just another convenient way of allocation with presumably
> > minimized number of mallocs (probably pre_alloc_size should be tweaked to a
> > better value).
>
> memroot wasn't created for that. There are approaches with much less
> overhead

Using memory pools like that is a nice design pattern for non-critical
sections. At least it is nicer than manual pointer management. I
believe I've seen the similar pattern in InnoDB with its memory pool
mem_heap_t even without preallocation! It would be nice to have a
special class that preallocates and manages pointers, but I do not see
the reason to develop that for this specific use case.

I don't believe in *much* overhead: mem_root is a simplest algorithm,
a couple of ifs and one loop. Initialization call
init_record_priority_queue() is not a critical section, the major
execution is mostly busy by a data copy in
handle_ordered_index_scan(), handle_ordered_next().

Let's see gprof stats collected for 100k SELECTs attached in stats.txt

init_record_priority_queue() takes 0.3%; handle_ordered_index_scan()
takes 2.3%; handle_ordered_next() takes 12.3%.

Let's see unpatched stats for the same 100k SELECTs in stats_vanilla.txt

init_record_priority_queue() takes 0.1%; 

Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2020-12-22 Thread Sergei Golubchik
Hi, Aleksey!

On Nov 03, Aleksey Midenkov wrote:
> 
> the updated patches are here:
> 
> https://github.com/MariaDB/server/commits/5f6bcd6cd0145513974b0dfc5b2cefba28b72f1a

I started writing my comments there, but realized that it'd be clearer
if I reply here, more context. That is, see below.

One comment about the new test case, you have 16K lines in the result
file, are all 16K necessary? Can you do something like

   select x, left(b,10), left(v,10)

?

And please, pretty please, don't use non-constant references as function
arguments.

> On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik  wrote:
> >
> > > diff --git a/sql/field.cc b/sql/field.cc
> > > index bdaaecc20269..0fd40c979d2c 100644
> > > --- a/sql/field.cc
> > > +++ b/sql/field.cc
> > > @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint 
> > > length,CHARSET_INFO *cs)
> > >copy_length= copier.well_formed_copy(field_charset,
> > > (char*) value.ptr(), new_length,
> > > cs, from, length);
> > > +  value.length(copy_length);
> >
> > good! Could this be a distinct bug with its own test case?
> 
> Do we need to spend time on that bureaucracy? There are real problems to
> solve... Tickets drain time from the management people and generally from
> anyone who sees them.

I didn't mean opening an MDEV. Only writing a test case and extracting
it into a separate commit.

If this is a genuine bug fix and you don't create a test case - how can
you be sure the fix will stay? It may disappear in a merge or a rebase,
manually or automatically. Or in a refactoring that "looks fine, all
tests pass". If you can create a test case - please, do.

> > >Field_blob::store_length(copy_length);
> > >bmove(ptr+packlength,(uchar*) ,sizeof(char*));
> > >
> > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > index 9d82df0235c7..7c658082d397 100644
> > > --- a/sql/ha_partition.cc
> > > +++ b/sql/ha_partition.cc
> > > @@ -5125,14 +5128,14 @@ bool ha_partition::init_record_priority_queue()
> > >uint alloc_len;
> > >uint used_parts= bitmap_bits_set(_part_info->read_partitions);
> > >/* Allocate record buffer for each used partition. */
> > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > >if (!m_using_extended_keys)
> > >m_priority_queue_rec_len += m_file[0]->ref_length;
> > >alloc_len= used_parts * m_priority_queue_rec_len;
> > >/* Allocate a key for temporary use when setting up the scan. */
> > >alloc_len+= table_share->max_key_length;
> > >
> > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len, MYF(MY_WME
> > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(_ordered_root, 
> > > alloc_len)))
> >
> > I don't see why you need a memroot here. memroot is needed when you
> > allocate later at difeerent points in time some initially unpredictable
> > amount of memory that has the same lifetime and needs to be
> > freed all at once.
> 
> I know... This is just another convenient way of allocation with presumably
> minimized number of mallocs (probably pre_alloc_size should be tweaked to a
> better value).

memroot wasn't created for that. There are approaches with much less
overhead

> >
> > But here it seems that you only allocate once. You could just make
> >
> >m_priority_queue_rec_len = m_rec_length + PARTITION_BYTES_IN_POS +
> >   table->s->blob_fields * sizeof(Ordered_blob_storage)
> 
> I don't wnat this data serialized. It is inconvenient to read the dump of
> the record. Why we should make it harder to code, read the code and debug
> while we can make it easier? If it is only for guaranteed 1 malloc, I don't
> think it's worth it.

What do you mean "serialized"?
What do you mean "read the dump"?

Note that you can also use my_multi_malloc, that allocates many buffers
together and hides address arithmetcs from the caller. Like

  my_multi_malloc(MYF(MY_WME),
 _ordered_rec_buffer, m_rec_length + PARTITION_BYTES_IN_POS,
 _storage, table->s->blob_fields * sizeof(Ordered_blob_storage),
 NULL)

(but with error checking). And then, perhaps, with placement new[]
you could construct all Ordered_blob_storage objects in one line.

> > > @@ -6178,7 +6203,11 @@ int ha_partition::handle_ordered_index_scan(uchar 
> > > *buf, bool reverse_order)
> > >*/
> > >error= file->read_range_first(m_start_key.key? _start_key: NULL,
> > >  end_range, eq_range, TRUE);
> > > -  memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > +  if (!error)
> > > +  {
> > > +memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > +  }
> >
> > did you have a bug because of that? something didn't work?
> 
> No, I just saw worthless memcpy and avoided it.

You're right that memcopy is unnecessary in case of an error.
But most of the time 

Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2020-11-06 Thread Aleksey Midenkov
Hi Sergei!

The updated patches are here:

https://github.com/MariaDB/server/commits/2c2e79487a9189e02fce8f884a9ab2b42b9aa333


On Tue, Nov 3, 2020 at 8:28 PM Aleksey Midenkov  wrote:
>
> Sergei,
>
> the updated patches are here:
>
>
https://github.com/MariaDB/server/commits/5f6bcd6cd0145513974b0dfc5b2cefba28b72f1a
>
> On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik  wrote:
> >
> > Hi, Aleksey!
> >
> > Don't be confused, the commit header and the comment come from your last
> > commit, but the diff below includes all three commits that mention
MDEV-18734.
> >
...
>
> > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > index 9d82df0235c7..7c658082d397 100644
> > > --- a/sql/ha_partition.cc
> > > +++ b/sql/ha_partition.cc
> > > @@ -5125,14 +5128,14 @@ bool
ha_partition::init_record_priority_queue()
> > >uint alloc_len;
> > >uint used_parts= bitmap_bits_set(_part_info->read_partitions);
> > >/* Allocate record buffer for each used partition. */
> > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > >if (!m_using_extended_keys)
> > >m_priority_queue_rec_len += m_file[0]->ref_length;
> > >alloc_len= used_parts * m_priority_queue_rec_len;
> > >/* Allocate a key for temporary use when setting up the scan. */
> > >alloc_len+= table_share->max_key_length;
> > >
> > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len,
MYF(MY_WME
> > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(_ordered_root,
alloc_len)))
> >
> > I don't see why you need a memroot here. memroot is needed when you
> > allocate later at difeerent points in time some initially unpredictable
> > amount of memory that has the same lifetime and needs to be
> > freed all at once.
>
> I know... This is just another convenient way of allocation with
presumably minimized number of mallocs (probably pre_alloc_size should be
tweaked to a better value).
>
> >
> > But here it seems that you only allocate once. You could just make
> >
> >m_priority_queue_rec_len = m_rec_length + PARTITION_BYTES_IN_POS +
> >   table->s->blob_fields * sizeof(Ordered_blob_storage)
>
> I don't want this data serialized. It is inconvenient to read the dump of
the record. Why we should make it harder to code, read the code and debug
while we can make it easier? If it is only for guaranteed 1 malloc, I don't
think it's worth it.
>

Added another patch that does 1 malloc in m_ordered_root.


--
All the best,

Aleksey Midenkov
@midenok
___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2020-11-03 Thread Aleksey Midenkov
Sergei,

the updated patches are here:

https://github.com/MariaDB/server/commits/5f6bcd6cd0145513974b0dfc5b2cefba28b72f1a

On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik  wrote:
>
> Hi, Aleksey!
>
> Don't be confused, the commit header and the comment come from your last
> commit, but the diff below includes all three commits that mention
MDEV-18734.
>
> On Nov 02, Aleksey Midenkov wrote:
> > revision-id: 2f9bbf392072 (mariadb-10.2.31-543-g2f9bbf392072)
> > parent(s): 1f4960e3f2ac
> > author: Aleksey Midenkov 
> > committer: Aleksey Midenkov 
> > timestamp: 2020-11-02 14:26:04 +0300
> > message:
> >
> > MDEV-18734 optimization by taking ownership via String::swap()
> >
> > ha_partition stores records in array of m_ordered_rec_buffer and uses
> > it for prio queue in ordered index scan. When the records are restored
> > from the array the blob buffers may be already freed or
> > rewritten. This can happen when blob buffers are cached in
> > Field_blob::value or Field_blob::read_value.
> >
> > Previous solution worked via copying blob buffer into
> > mem_root-allocated buffer.  That was not optimal as it requires more
> > memory and processing time.
> >
> > This patch avoids memory copy by taking temporary ownership of cached
> > blob buffers via String::swap().  It also fixes problem when
> > value.str_length is wrong after Field_blob::store().
>
>
> > diff --git a/mysql-test/suite/federated/federated_partition.test
b/mysql-test/suite/federated/federated_partition.test
> > index ef1e27ec5054..33ee025442f4 100644
> > --- a/mysql-test/suite/federated/federated_partition.test
> > +++ b/mysql-test/suite/federated/federated_partition.test
> > @@ -50,4 +50,30 @@ drop table federated.t1_2;
> >
> >  --echo End of 5.1 tests
> >
> > +--echo #
> > +--echo # MDEV-18734 ASAN heap-use-after-free upon sorting by blob
column from partitioned table
> > +--echo #
> > +connection slave;
> > +use federated;
> > +create table t1_1 (x int, b text, key(x));
> > +create table t1_2 (x int, b text, key(x));
> > +connection master;
> > +--replace_result $SLAVE_MYPORT SLAVE_PORT
> > +eval create table t1 (x int, b text, key(x)) engine=federated
> > +  partition by range columns (x) (
> > +  partition p1 values less than (40)
connection='mysql://root@127.0.0.1:$SLAVE_MYPORT/federated/t1_1',
> > +  partition pn values less than (maxvalue)
connection='mysql://root@127.0.0.1:$SLAVE_MYPORT/federated/t1_2'
> > +);
> > +insert t1 values (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7,
7), (8, 8);
> > +insert t1 select x + 8, x + 8 from t1;
> > +insert t1 select x + 16, x + 16 from t1;
> > +insert t1 select x + 49, repeat(x + 49, 100) from t1;
> > +--echo # This produces wrong result before MDEV-17573
> > +select * from t1;
>
> What do you mean, simple `select * from t1` produces incorrect result?

Removed that garbage line.

>
> > +flush tables;
> > +select x, b from t1 where x > 30 and x < 60 order by b;
>
> Is this ASAN-only test?

No, it produces wrong result in 10.2 for me.

>
> > +drop table t1;
> > +connection slave;
> > +drop table t1_1, t1_2;
> > +
> >  source include/federated_cleanup.inc;
> > diff --git a/mysql-test/suite/vcol/t/partition.test
b/mysql-test/suite/vcol/t/partition.test
> > index 889724fb1c5c..f10ee0491ccc 100644
> > --- a/mysql-test/suite/vcol/t/partition.test
> > +++ b/mysql-test/suite/vcol/t/partition.test
> > @@ -30,3 +30,28 @@ subpartition by hash(v) subpartitions 3 (
> >  insert t1 set i= 0;
> >  set statement sql_mode= '' for update t1 set i= 1, v= 2;
> >  drop table t1;
> > +
> > +--echo #
> > +--echo # MDEV-18734 ASAN heap-use-after-free in
my_strnxfrm_simple_internal upon update on versioned partitioned table
> > +--echo #
> > +--echo # Cover queue_fix() in ha_partition::handle_ordered_index_scan()
> > +create table t1 (
> > +x int auto_increment primary key,
> > +b text, v mediumtext as (b) virtual,
> > +index (v(10))
> > +) partition by range columns (x) (
> > +partition p1 values less than (3),
> > +partition p2 values less than (6),
> > +partition pn values less than (maxvalue));
> > +insert into t1 (b) values ('q'), ('a'), ('b');
> > +update t1 set b= 'bar' where v > 'a';
> > +drop table t1;
>
> This test didn't fail for me in an ASAN build
> on the vanilla 10.5 without your fix.
> That is, I'm not sure it actually tests the fix.

It failed on 10.3 for me some time ago. Ok, replaced that with the bigger
version which fails on non-ASAN optimized all right.

>
> > +
> > +--echo # Cover return_top_record() in
ha_partition::handle_ordered_index_scan()
> > +create table t1 (x int primary key, b tinytext, v text as (b) virtual)
> > +partition by range columns (x) (
> > +  partition p1 values less than (4),
> > +  partition pn values less than (maxvalue));
> > +insert into t1 (x, b) values (1, ''), (2, ''), (3, 'a'), (4, 'b');
> > +update t1 set b= 'bar' where x > 0 order by v limit 2;
>
> This fails without a fix all right.
>
> Still, I wonder whether it's possible to create a 

Re: [Maria-developers] 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

2020-11-02 Thread Sergei Golubchik
Hi, Aleksey!

Don't be confused, the commit header and the comment come from your last
commit, but the diff below includes all three commits that mention MDEV-18734.

On Nov 02, Aleksey Midenkov wrote:
> revision-id: 2f9bbf392072 (mariadb-10.2.31-543-g2f9bbf392072)
> parent(s): 1f4960e3f2ac
> author: Aleksey Midenkov 
> committer: Aleksey Midenkov 
> timestamp: 2020-11-02 14:26:04 +0300
> message:
> 
> MDEV-18734 optimization by taking ownership via String::swap()
> 
> ha_partition stores records in array of m_ordered_rec_buffer and uses
> it for prio queue in ordered index scan. When the records are restored
> from the array the blob buffers may be already freed or
> rewritten. This can happen when blob buffers are cached in
> Field_blob::value or Field_blob::read_value.
> 
> Previous solution worked via copying blob buffer into
> mem_root-allocated buffer.  That was not optimal as it requires more
> memory and processing time.
> 
> This patch avoids memory copy by taking temporary ownership of cached
> blob buffers via String::swap().  It also fixes problem when
> value.str_length is wrong after Field_blob::store().


> diff --git a/mysql-test/suite/federated/federated_partition.test 
> b/mysql-test/suite/federated/federated_partition.test
> index ef1e27ec5054..33ee025442f4 100644
> --- a/mysql-test/suite/federated/federated_partition.test
> +++ b/mysql-test/suite/federated/federated_partition.test
> @@ -50,4 +50,30 @@ drop table federated.t1_2;
>  
>  --echo End of 5.1 tests
>  
> +--echo #
> +--echo # MDEV-18734 ASAN heap-use-after-free upon sorting by blob column 
> from partitioned table
> +--echo #
> +connection slave;
> +use federated;
> +create table t1_1 (x int, b text, key(x));
> +create table t1_2 (x int, b text, key(x));
> +connection master;
> +--replace_result $SLAVE_MYPORT SLAVE_PORT
> +eval create table t1 (x int, b text, key(x)) engine=federated
> +  partition by range columns (x) (
> +  partition p1 values less than (40) 
> connection='mysql://root@127.0.0.1:$SLAVE_MYPORT/federated/t1_1',
> +  partition pn values less than (maxvalue) 
> connection='mysql://root@127.0.0.1:$SLAVE_MYPORT/federated/t1_2'
> +);
> +insert t1 values (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 
> 8);
> +insert t1 select x + 8, x + 8 from t1;
> +insert t1 select x + 16, x + 16 from t1;
> +insert t1 select x + 49, repeat(x + 49, 100) from t1;
> +--echo # This produces wrong result before MDEV-17573
> +select * from t1;

What do you mean, simple `select * from t1` produces incorrect result?

> +flush tables;
> +select x, b from t1 where x > 30 and x < 60 order by b;

Is this ASAN-only test?

> +drop table t1;
> +connection slave;
> +drop table t1_1, t1_2;
> +
>  source include/federated_cleanup.inc;
> diff --git a/mysql-test/suite/vcol/t/partition.test 
> b/mysql-test/suite/vcol/t/partition.test
> index 889724fb1c5c..f10ee0491ccc 100644
> --- a/mysql-test/suite/vcol/t/partition.test
> +++ b/mysql-test/suite/vcol/t/partition.test
> @@ -30,3 +30,28 @@ subpartition by hash(v) subpartitions 3 (
>  insert t1 set i= 0;
>  set statement sql_mode= '' for update t1 set i= 1, v= 2;
>  drop table t1;
> +
> +--echo #
> +--echo # MDEV-18734 ASAN heap-use-after-free in my_strnxfrm_simple_internal 
> upon update on versioned partitioned table
> +--echo #
> +--echo # Cover queue_fix() in ha_partition::handle_ordered_index_scan()
> +create table t1 (
> +x int auto_increment primary key,
> +b text, v mediumtext as (b) virtual,
> +index (v(10))
> +) partition by range columns (x) (
> +partition p1 values less than (3),
> +partition p2 values less than (6),
> +partition pn values less than (maxvalue));
> +insert into t1 (b) values ('q'), ('a'), ('b');
> +update t1 set b= 'bar' where v > 'a';
> +drop table t1;

This test didn't fail for me in an ASAN build
on the vanilla 10.5 without your fix.
That is, I'm not sure it actually tests the fix.

> +
> +--echo # Cover return_top_record() in 
> ha_partition::handle_ordered_index_scan()
> +create table t1 (x int primary key, b tinytext, v text as (b) virtual)
> +partition by range columns (x) (
> +  partition p1 values less than (4),
> +  partition pn values less than (maxvalue));
> +insert into t1 (x, b) values (1, ''), (2, ''), (3, 'a'), (4, 'b');
> +update t1 set b= 'bar' where x > 0 order by v limit 2;

This fails without a fix all right.

Still, I wonder whether it's possible to create a test that'll fail
in a normal optimized build w/o ASAN

> +drop table t1;
> diff --git a/sql/field.cc b/sql/field.cc
> index bdaaecc20269..0fd40c979d2c 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint 
> length,CHARSET_INFO *cs)
>copy_length= copier.well_formed_copy(field_charset,
> (char*) value.ptr(), new_length,
> cs, from, length);
> +  value.length(copy_length);

good! Could this be a distinct bug with its