Re: optimize lookups in snapshot [sub]xip arrays

2022-08-10 Thread Nathan Bossart
On Thu, Aug 11, 2022 at 09:50:54AM +0700, John Naylor wrote: > I was waiting for all the Windows animals to report in, and it looks > like they have, so I've pushed 0002. Thanks for picking this topic up > again! Thanks for reviewing and committing. -- Nathan Bossart Amazon Web Services:

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-10 Thread John Naylor
On Thu, Aug 11, 2022 at 4:46 AM Nathan Bossart wrote: > > On Wed, Aug 10, 2022 at 10:50:02AM +0700, John Naylor wrote: > > LGTM, let's see what the buildfarm thinks of 0001. > > Thanks! I haven't noticed any related buildfarm failures yet. I was waiting for all the Windows animals to report in,

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-10 Thread Nathan Bossart
On Wed, Aug 10, 2022 at 10:50:02AM +0700, John Naylor wrote: > LGTM, let's see what the buildfarm thinks of 0001. Thanks! I haven't noticed any related buildfarm failures yet. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-09 Thread John Naylor
On Wed, Aug 10, 2022 at 7:13 AM Nathan Bossart wrote: > > On Tue, Aug 09, 2022 at 01:00:37PM -0700, Nathan Bossart wrote: > > Your adjustments in 0002 seem reasonable to me. I think it makes sense to > > ensure there is test coverage for pg_lfind32(), but I don't know if that > > syscache code

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-09 Thread Masahiko Sawada
On Wed, Aug 10, 2022 at 5:00 AM Nathan Bossart wrote: > > On Tue, Aug 09, 2022 at 01:21:41PM +0700, John Naylor wrote: > > I decided I wasn't quite comfortable changing snapshot handling > > without further guarantees. To this end, 0002 in the attached v11 is > > an addendum that adds assert

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-09 Thread Nathan Bossart
On Tue, Aug 09, 2022 at 01:00:37PM -0700, Nathan Bossart wrote: > Your adjustments in 0002 seem reasonable to me. I think it makes sense to > ensure there is test coverage for pg_lfind32(), but I don't know if that > syscache code is the right choice. For non-USE_SSE2 builds, it might make >

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-09 Thread Nathan Bossart
On Tue, Aug 09, 2022 at 01:21:41PM +0700, John Naylor wrote: > I decided I wasn't quite comfortable changing snapshot handling > without further guarantees. To this end, 0002 in the attached v11 is > an addendum that adds assert checking (also pgindent and some > comment-smithing). As I

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-09 Thread John Naylor
On Tue, Aug 9, 2022 at 10:51 AM Nathan Bossart wrote: > Fixed in v10. I decided I wasn't quite comfortable changing snapshot handling without further guarantees. To this end, 0002 in the attached v11 is an addendum that adds assert checking (also pgindent and some comment-smithing). As I

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Tom Lane
Nathan Bossart writes: > On Tue, Aug 09, 2022 at 09:40:15AM +0530, Bharath Rupireddy wrote: >> Isn't it a good idea to capture the above two points as comments in >> port/pg_lfind.h just to not lose track of it? I know these are present >> in the hackers thread, but having them in the form of

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Nathan Bossart
On Tue, Aug 09, 2022 at 09:40:15AM +0530, Bharath Rupireddy wrote: > Isn't it a good idea to capture the above two points as comments in > port/pg_lfind.h just to not lose track of it? I know these are present > in the hackers thread, but having them in the form of comments helps > developers who

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Bharath Rupireddy
On Tue, Aug 9, 2022 at 4:37 AM Nathan Bossart wrote: > > On Mon, Aug 08, 2022 at 12:56:28PM +0530, Bharath Rupireddy wrote: > > 1) pg_lfind32 - why just uint32? If it's not possible to define > > functions for char, unsigned char, int16, uint16, int32, int64, uint64 > > and so on, can we add a

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Nathan Bossart
On Tue, Aug 09, 2022 at 10:57:44AM +0900, Masahiko Sawada wrote: > The patch looks good to me. One minor point is: Thanks for taking a look. > + * IDENTIFICATION > + * src/port/pg_lfind.h > > The path doesn't match to the actual file path, src/include/port/pg_lfind.h. Fixed in v10. --

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Masahiko Sawada
On Tue, Aug 9, 2022 at 7:33 AM Nathan Bossart wrote: > > On Mon, Aug 08, 2022 at 01:46:48PM +0700, John Naylor wrote: > > Okay, I think it's basically in good shape. Since it should be a bit > > faster than a couple versions ago, would you be up for retesting with > > the original test having 8

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Nathan Bossart
On Mon, Aug 08, 2022 at 12:56:28PM +0530, Bharath Rupireddy wrote: > 1) pg_lfind32 - why just uint32? If it's not possible to define > functions for char, unsigned char, int16, uint16, int32, int64, uint64 > and so on, can we add a few comments around that? Also, the comments > can talk about if

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Nathan Bossart
On Mon, Aug 08, 2022 at 01:46:48PM +0700, John Naylor wrote: > Okay, I think it's basically in good shape. Since it should be a bit > faster than a couple versions ago, would you be up for retesting with > the original test having 8 to 512 writers? Sure thing. The results are similar. As

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Bharath Rupireddy
On Mon, Aug 8, 2022 at 2:30 PM John Naylor wrote: > > On Mon, Aug 8, 2022 at 2:26 PM Bharath Rupireddy > wrote: > > > 3) Can pg_lfind32 return the index of the key found, for instance to > > use it for setting/resetting the found element in the array? > > That was just discussed. It's slightly

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread John Naylor
On Mon, Aug 8, 2022 at 2:26 PM Bharath Rupireddy wrote: > > > 1) pg_lfind32 - why just uint32? If it's not possible to define > functions for char, unsigned char, int16, uint16, int32, int64, uint64 > and so on, can we add a few comments around that? Also, the comments Future work, as far as I'm

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread Bharath Rupireddy
On Mon, Aug 8, 2022 at 12:17 PM John Naylor wrote: > > On Sun, Aug 7, 2022 at 4:25 AM Nathan Bossart > wrote: > > > > [v8] > > Okay, I think it's basically in good shape. Since it should be a bit > faster than a couple versions ago, would you be up for retesting with > the original test having

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-08 Thread John Naylor
On Sun, Aug 7, 2022 at 4:25 AM Nathan Bossart wrote: > > [v8] Okay, I think it's basically in good shape. Since it should be a bit faster than a couple versions ago, would you be up for retesting with the original test having 8 to 512 writers? And also add the const markers we discussed

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-06 Thread Nathan Bossart
On Sat, Aug 06, 2022 at 11:13:26AM -0700, Nathan Bossart wrote: > On Fri, Aug 05, 2022 at 03:04:34PM -0700, Andres Freund wrote: >> But mainly I'd expect to find a difference if the SIMD code were optimized a >> further on the basis of not needing to return the offset. E.g. by >> replacing

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-06 Thread Nathan Bossart
On Fri, Aug 05, 2022 at 03:04:34PM -0700, Andres Freund wrote: > But mainly I'd expect to find a difference if the SIMD code were optimized a > further on the basis of not needing to return the offset. E.g. by > replacing _mm_packs_epi32 with _mm_or_si128, that's cheaper. I haven't been able to

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-05 Thread Andres Freund
Hi, On 2022-08-05 13:25:10 -0700, Nathan Bossart wrote: > I went ahead and renamed it to pg_lfind32() and switched it back to > returning the pointer. That felt the cleanest from the naming perspective, > but as Andres noted, it might not be as fast as just looking for the > presence of the

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-05 Thread Nathan Bossart
On Fri, Aug 05, 2022 at 11:02:15AM +0700, John Naylor wrote: > That is a good point. Maybe potential helpers in simd.h should only deal > specifically with vector registers, with it's users providing C fallbacks. > I don't have any good ideas of where to put the new function, though. I moved it

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-04 Thread John Naylor
On Fri, Aug 5, 2022 at 5:15 AM Nathan Bossart wrote: > > On Thu, Aug 04, 2022 at 02:58:14PM +0700, John Naylor wrote: > > Were you considering adding the new function to simd.h now that that's > > committed? It's a bit up in the air what should go in there, but this new > > function is low-level

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-04 Thread Nathan Bossart
On Thu, Aug 04, 2022 at 02:58:14PM +0700, John Naylor wrote: > Were you considering adding the new function to simd.h now that that's > committed? It's a bit up in the air what should go in there, but this new > function is low-level and generic enough to be a candidate... I don't have a strong

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-04 Thread John Naylor
On Thu, Aug 4, 2022 at 3:25 AM Nathan Bossart wrote: > Here is a new patch set without the annotation. Were you considering adding the new function to simd.h now that that's committed? It's a bit up in the air what should go in there, but this new function is low-level and generic enough to be a

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-03 Thread Nathan Bossart
On Wed, Aug 03, 2022 at 11:06:58AM -0700, Andres Freund wrote: > On 2022-08-02 16:43:57 -0700, Nathan Bossart wrote: >> >> +#ifdef USE_SSE2 >> >> +pg_attribute_no_sanitize_alignment() >> >> +#endif >> > >> > What's the deal with this annotation? Needs a comment. >> >> Will do. c.h suggests that

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-03 Thread Andres Freund
Hi, On 2022-08-02 16:43:57 -0700, Nathan Bossart wrote: > >> +/* > >> + * pg_linearsearch_uint32 > >> + * > >> + * Returns the address of the first element in 'base' that equals 'key', > >> or > >> + * NULL if no match is found. > >> + */ > >> +#ifdef USE_SSE2 > >>

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-03 Thread Nathan Bossart
Here is a new patch set. 0001 is the currently-proposed patch from the other thread [0] for determining SSE2 support. 0002 introduces the optimized linear search function. And 0003 makes use of the new function for the [sub]xip lookups in XidInMVCCSnapshot(). [0]

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-02 Thread John Naylor
On Wed, Aug 3, 2022 at 6:43 AM Nathan Bossart wrote: > Just under half of the callers in 0002 require the offset, but I don't know > if any of those are worth optimizing in the first place. I'll change it > for now. It's easy enough to add it back in the future if required. Yeah, some of those

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-02 Thread Nathan Bossart
On Tue, Aug 02, 2022 at 03:55:39PM -0700, Andres Freund wrote: > FWIW, I'd split the introduction of the helper and the use of it in snapmgr > into separate patches. Will do. > On 2022-08-02 15:13:01 -0700, Nathan Bossart wrote: >> diff --git a/src/include/c.h b/src/include/c.h >> index

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-02 Thread Andres Freund
Hi, FWIW, I'd split the introduction of the helper and the use of it in snapmgr into separate patches. On 2022-08-02 15:13:01 -0700, Nathan Bossart wrote: > diff --git a/src/include/c.h b/src/include/c.h > index d35405f191..2c1a47bc28 100644 > --- a/src/include/c.h > +++ b/src/include/c.h > @@

Re: optimize lookups in snapshot [sub]xip arrays

2022-08-02 Thread Nathan Bossart
On Fri, Jul 29, 2022 at 10:38:11PM -0700, Nathan Bossart wrote: > On Sat, Jul 30, 2022 at 12:02:02PM +0700, John Naylor wrote: >> I'll be the first to say it's not committable and needs some thought. Since >> there are several recently proposed patches that take advantage of SSE2, it >> seems time

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-29 Thread Nathan Bossart
On Sat, Jul 30, 2022 at 12:02:02PM +0700, John Naylor wrote: > On Fri, Jul 29, 2022 at 4:34 AM Nathan Bossart > wrote: >> * I borrowed USE_SSE2 from one of John Naylor's patches [0]. I'm not > sure >>whether this is committable, > > I'll be the first to say it's not committable and needs

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-29 Thread John Naylor
On Fri, Jul 29, 2022 at 4:34 AM Nathan Bossart wrote: > * I briefly looked into seeing whether auto-vectorization was viable and >concluded it was not for these loops. > > * I borrowed USE_SSE2 from one of John Naylor's patches [0]. I'm not sure >whether this is committable, I'll be

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-28 Thread Nathan Bossart
On Tue, Jul 26, 2022 at 11:19:06AM -0700, Andres Freund wrote: > On 2022-07-25 12:04:19 -0700, Nathan Bossart wrote: >> From the discussion thus far, it seems there is interest in optimizing >> [sub]xip lookups, so I'd like to spend some time moving it forward. I >> think the biggest open

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-26 Thread Andres Freund
On 2022-07-25 12:04:19 -0700, Nathan Bossart wrote: > On Sat, Jul 16, 2022 at 08:59:57PM -0700, Nathan Bossart wrote: > > On Fri, Jul 15, 2022 at 01:08:57PM -0700, Andres Freund wrote: > >> I wonder if we additionally / alternatively could use a faster method of > >> searching the array linearly,

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-25 Thread Zhang Mingli
> On Jul 26, 2022, at 03:04, Nathan Bossart wrote: >> > From the discussion thus far, it seems there is interest in optimizing > [sub]xip lookups, so I'd like to spend some time moving it forward. I > think the biggest open question is which approach to take. Both the SIMD > and hash table

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-25 Thread Nathan Bossart
On Sat, Jul 16, 2022 at 08:59:57PM -0700, Nathan Bossart wrote: > On Fri, Jul 15, 2022 at 01:08:57PM -0700, Andres Freund wrote: >> I wonder if we additionally / alternatively could use a faster method of >> searching the array linearly, e.g. using SIMD. > > I looked into using SIMD. The patch

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-24 Thread Zhang Mingli
Got it, thanks. Regards, Zhang Mingli > On Jul 25, 2022, at 12:08, Nathan Bossart wrote: > > On Sun, Jul 24, 2022 at 12:48:25PM +0800, Zhang Mingli wrote: >> If snaphost->suboverflowed is false then the subxcnt must be less than >> PGPROC_MAX_CACHED_SUBXIDS which is 64 now. >> >> And we

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-24 Thread Nathan Bossart
On Sun, Jul 24, 2022 at 12:48:25PM +0800, Zhang Mingli wrote: > If snaphost->suboverflowed is false then the subxcnt must be less than > PGPROC_MAX_CACHED_SUBXIDS which is 64 now. > > And we won’t use hash if the xcnt is less than XIP_HASH_MIN_ELEMENTS which is > 128 currently during

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-24 Thread Yura Sokolov
В Ср, 13/07/2022 в 10:09 -0700, Nathan Bossart пишет: > Hi hackers, > > A few years ago, there was a proposal to create hash tables for long > [sub]xip arrays in snapshots [0], but the thread seems to have fizzled out. > I was curious whether this idea still showed measurable benefits, so I >

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-23 Thread Zhang Mingli
Hi, all > > if (!snapshot->suboverflowed) > { > /* we have full data, so search subxip */ > - int32 j; > - > - for (j = 0; j < snapshot->subxcnt; j++) > - { > -

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-16 Thread Nathan Bossart
Hi Andres, Thanks for taking a look. On Fri, Jul 15, 2022 at 01:08:57PM -0700, Andres Freund wrote: > Hm. Are there any contexts where we'd not want the potential for failing due > to OOM? I'm not sure about this one. > I wonder if we additionally / alternatively could use a faster method of >

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-15 Thread Andres Freund
Hi, Sounds worth pursuing. On 2022-07-13 10:09:50 -0700, Nathan Bossart wrote: > The attached patch has some key differences from the previous proposal. > For example, the new patch uses simplehash instead of open-coding a new > hash table. +1 > The attached patch waits until a lookup of

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-14 Thread Nathan Bossart
Hi Bharath, Thanks for taking a look. On Thu, Jul 14, 2022 at 03:10:56PM +0530, Bharath Rupireddy wrote: > Aren't these snapshot arrays always sorted? I see the following code: > > /* sort so we can bsearch() */ > qsort(snapshot->xip, snapshot->xcnt, sizeof(TransactionId), xidComparator); > >

Re: optimize lookups in snapshot [sub]xip arrays

2022-07-14 Thread Bharath Rupireddy
On Wed, Jul 13, 2022 at 10:40 PM Nathan Bossart wrote: > > Hi hackers, > > A few years ago, there was a proposal to create hash tables for long > [sub]xip arrays in snapshots [0], but the thread seems to have fizzled out. > I was curious whether this idea still showed measurable benefits, so I >

optimize lookups in snapshot [sub]xip arrays

2022-07-13 Thread Nathan Bossart
Hi hackers, A few years ago, there was a proposal to create hash tables for long [sub]xip arrays in snapshots [0], but the thread seems to have fizzled out. I was curious whether this idea still showed measurable benefits, so I revamped the patch and ran the same test as before [1]. Here are the