Re: JIT compiling with LLVM v9.0

2018-02-11 Thread Merlin Moncure
On Thu, Jan 25, 2018 at 9:40 AM, Konstantin Knizhnik wrote:
> As far as I understand generation of native code is now always done for all
> supported expressions and individually by each backend.
> I wonder whether it would be useful to put more effort into understanding
> when compilation to native code should be done and when interpretation is
> better. For example, many JIT-able languages like Lua use traces, i.e. the
> query is first interpreted and a trace is generated. If the same trace is
> followed more than N times, then native code is generated for it.
>
> In context of DBMS executor it is obvious that only frequently executed or
> expensive queries have to be compiled.
> So we can use estimated plan cost and number of query executions as simple
> criteria for JIT-ing the query.
> Maybe compilation of simple queries (with small cost) should be done only
> for prepared statements...
>
> Another question is whether it is sensible to redundantly do expensive work
> (llvm compilation) in all backends.
> This question refers to a shared prepared statement cache. But even without
> such a cache, it seems possible to use some signature of the compiled
> expression as the library name and to share these libraries between
> backends. So before starting code generation, ExecReadyCompiledExpr can
> first build the signature and check whether the corresponding library is
> already present.
> Also it will be easier to control space used by compiled libraries in this
> case.

Totally agree; these considerations are very important.
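
A minimal sketch of the signature idea quoted above, written in Python for brevity rather than in the backend's C. Everything in it is hypothetical: the function names, the "jit_" file prefix, the choice of SHA-256, and the compile_fn callback are illustrations only and are not taken from the patch. It shows nothing more than the "build signature, check whether the library already exists, otherwise compile" flow.

```python
import hashlib
import tempfile
from pathlib import Path


def expression_signature(expr_text: str, llvm_version: str) -> str:
    """Digest of the expression text plus the LLVM version (both assumed inputs)."""
    h = hashlib.sha256()
    h.update(llvm_version.encode("utf-8"))
    h.update(b"\0")  # separator keeps the two fields distinct
    h.update(expr_text.encode("utf-8"))
    return h.hexdigest()


def library_path(cache_dir: Path, signature: str) -> Path:
    """Hypothetical naming scheme: one shared library per signature."""
    return cache_dir / f"jit_{signature}.so"


def get_or_compile(cache_dir: Path, expr_text: str, llvm_version: str, compile_fn):
    """Return (path, compiled_now); compile_fn(expr_text, path) must create path."""
    path = library_path(cache_dir, expression_signature(expr_text, llvm_version))
    if path.exists():
        return path, False  # some backend already paid for this compilation
    compile_fn(expr_text, path)
    return path, True


if __name__ == "__main__":
    def fake_compile(expr_text: str, path: Path) -> None:
        # Stand-in for real code generation; just creates the file.
        path.write_bytes(b"placeholder for native code")

    with tempfile.TemporaryDirectory() as d:
        print(get_or_compile(Path(d), "a + b", "5.0", fake_compile)[1])
        print(get_or_compile(Path(d), "a + b", "5.0", fake_compile)[1])
```

A real implementation would presumably also need to publish the file atomically (for example by writing to a temporary name and renaming), since several backends could miss the cache at the same moment.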

I tested several queries in my application that had >30 second compile
times against a one second run time.  Not being able to manage when
compilation happens is making it difficult to get a sense of llvm
performance in the general case.  Having explain analyze print compile
time and being able to prepare llvm compiled queries ought to help
measurement and tuning.  There may be utility here beyond large
analytical queries as the ability to optimize spreads through the
executor with the right trade off management.
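
To make the compile-time versus run-time trade-off concrete, here is a small arithmetic sketch in Python. The function name and the 0.5 second compiled run time are assumptions for illustration; only the 30 second compile time and the one second interpreted run time come from the numbers above.

```python
import math
from typing import Optional


def executions_to_break_even(compile_s: float, interp_s: float,
                             jit_s: float) -> Optional[int]:
    """Smallest execution count at which compiling is no slower overall.

    Total interpreted time is n * interp_s; total compiled time is
    compile_s + n * jit_s. Returns None if compiled code is not faster.
    """
    saving_per_run = interp_s - jit_s
    if saving_per_run <= 0:
        return None  # compilation can never pay for itself
    return math.ceil(compile_s / saving_per_run)


if __name__ == "__main__":
    # 30 s compile, 1 s interpreted, hypothetical 0.5 s once compiled.
    print(executions_to_break_even(30.0, 1.0, 0.5))
```

Under those assumed numbers the query would have to run 60 times in the same backend before compilation stops being a net loss, which is why being able to prepare compiled queries and to see the compile time in explain analyze matters for tuning.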

This work is very exciting...thank you.

merlin



Re: JIT compiling with LLVM v9.0

2018-02-10 Thread Peter Geoghegan
On Wed, Jan 31, 2018 at 8:53 AM, Robert Haas  wrote:
> As far as the second one, looking back at what happened with parallel
> query, I found (on a quick read) 13 back-patched commits in
> REL9_6_STABLE prior to the release of 10.0, 3 of which I would qualify
> as low-importance (improving documentation, fixing something that's
> not really a bug, improving a test case).  A couple of those were
> really stupid mistakes on my part.  On the other hand, would it have
> been overall worse for our users if that feature had been turned on in
> 9.6?  I don't know.  They would have had those bugs (at least until we
> fixed them) but they would have had parallel query, too.  It's hard
> for me to judge whether that was a win or a loss, and so here.  Like
> parallel query, this is a feature which seems to have a low risk of
> data corruption, but a fairly high risk of wrong answers to queries
> and/or strange errors.   Users don't like that.  On the other hand,
> also like parallel query, if you've got the right kind of queries, it
> can make them go a lot faster.  Users DO like that.

As a data point, I can tell you that Heroku enabled parallel query for
9.6 immediately, and it turned out fine. The first version available
as stable was probably 9.6.3 -- there or thereabouts.

There were some bugs, of course, but not to the extent that 9.6 was
looked upon as being more buggy than the average Postgres release.

-- 
Peter Geoghegan



Re: JIT compiling with LLVM v9.0

2018-02-09 Thread Andres Freund
On 2018-02-09 09:10:25 -0600, Merlin Moncure wrote:
> Question:  when watching the compilation log, I see quite a few files
> being compiled with both O2 and O1, for example:
> 
> clang -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
> -fwrapv -Wno-unused-command-line-argument -O2 -O1
> -Wno-ignored-attributes -Wno-unknown-warning-option
> -Wno-ignored-optimization-argument -I../../../../src/include
> -D_GNU_SOURCE -I/home/mmoncure/llvm/include -DLLVM_BUILD_GLOBAL_ISEL
> -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS
> -D__STDC_LIMIT_MACROS  -flto=thin -emit-llvm -c -o nbtsort.bc
> nbtsort.c
> 
> Is this intentional?  (didn't check standard compilation, it just jumped out).

It stems from the following hunk in Makefile.global.in about emitting
bitcode:
# Add -O1 to the options as clang otherwise will emit 'noinline'
# attributes everywhere, making JIT inlining impossible to test in a
# debugging build.
#
# FIXME: While LLVM will re-optimize when emitting code (after
# inlining), it'd be better to only do this if -O0 is specified.
%.bc : CFLAGS +=-O1

%.bc : %.c
$(COMPILE.c.bc) -o $@ $<

Having inspected the clang source code, I can say it's impossible to stop
clang from emitting noinline attributes for every function at -O0.

I think it makes sense to change this to filtering out -O0 and only
adding -O1 if that's not present. :/
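
One possible reading of that proposal, sketched in Python instead of make syntax: drop any -O0, then add -O1 only if no other optimization level is left. The function name is hypothetical, and treating every flag that starts with "-O" as an optimization level is a simplifying assumption.

```python
from typing import List


def bitcode_cflags(cflags: List[str]) -> List[str]:
    """Return flags for bitcode emission without an effective -O0.

    -O0 is filtered out; -O1 is appended only when no other -O flag
    remains, so an explicit -O2 is left alone instead of being followed
    by -O1 as in the log above.
    """
    kept = [flag for flag in cflags if flag != "-O0"]
    has_opt_level = any(flag.startswith("-O") for flag in kept)
    if not has_opt_level:
        kept.append("-O1")
    return kept


if __name__ == "__main__":
    print(bitcode_cflags(["-Wall", "-O2"]))
    print(bitcode_cflags(["-g", "-O0"]))
```

With this rule an optimized build keeps its own level, while a debugging build still gets the -O1 it needs for inlining to be testable.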

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-02-09 Thread Merlin Moncure
On Thu, Feb 1, 2018 at 8:16 PM, Thomas Munro wrote:
> On Fri, Feb 2, 2018 at 2:05 PM, Andres Freund  wrote:
>> On 2018-02-01 09:32:17 -0800, Jeff Davis wrote:
>>> On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik wrote:
>>> > The same problem takes place with old versions of GCC: I have to upgrade
>>> > GCC to 7.2 to make it possible to compile this code.
>>> > The problem is not in the compiler itself, but in libc++ headers.
>>>
>>> How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
>>> and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
>>> CXXFLAGS required?
>>
>> Just to understand: You're running into the issue with the header being
>> included from within the extern "C" {}?  Hm, I've pushed a quick fix for
>> that.
>
> That change wasn't quite enough: to get this building against libc++
> (Clang's native stdlib) I also needed this change to llvmjit.h so that
> the header wouldn't be included with the wrong linkage (perhaps
> you can find a less ugly way):
>
> +#ifdef __cplusplus
> +}
> +#endif
>  #include 
> +#ifdef __cplusplus
> +extern "C"
> +{
> +#endif

This did the trick -- thanks.  Sitting through 20 minute computer
crashing link times really brings back C++ nightmares -- if anyone
else needs to compile llvm/clang as I did (I'm stuck on 3.2 with my
aging mint box), I strongly encourage you to use the gold linker.

Question:  when watching the compilation log, I see quite a few files
being compiled with both O2 and O1, for example:

clang -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -Wno-unused-command-line-argument -O2 -O1
-Wno-ignored-attributes -Wno-unknown-warning-option
-Wno-ignored-optimization-argument -I../../../../src/include
-D_GNU_SOURCE -I/home/mmoncure/llvm/include -DLLVM_BUILD_GLOBAL_ISEL
-D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS
-D__STDC_LIMIT_MACROS  -flto=thin -emit-llvm -c -o nbtsort.bc
nbtsort.c

Is this intentional?  (didn't check standard compilation, it just jumped out).

merlin



Re: JIT compiling with LLVM v9.0

2018-02-02 Thread Andres Freund
Hi,

On 2018-02-02 18:22:34 +1300, Thomas Munro wrote:
> The clang that was used for bitcode was the system /usr/bin/clang,
> version 4.0.  Is it a problem that I used that for compiling the
> bitcode, but LLVM5 for JIT?  I actually tried
> CLANG=/usr/local/llvm50/bin/clang but ran into weird failures I
> haven't got to the bottom of at ThinLink time so I couldn't get as far
> as a running system.

You're using thinlto to compile pg? Could you provide what you pass to
configure for that? IIRC I tried that a while ago and ran into some
issues with us creating archives (libpgport, libpgcommon).

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-02-02 Thread Andres Freund
On 2018-02-01 22:20:01 -0800, Jeff Davis wrote:
> Thanks! That worked, but I had to remove the "-stdlib=libc++" also,
> which was causing me problems.

That'll be gone as soon as I finish the shlib thing. Will hope to have
something over the weekend. Right now I'm at FOSDEM and need to prepare
a talk for tomorrow.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-02-02 Thread Andres Freund
Hi,

On 2018-02-02 18:22:34 +1300, Thomas Munro wrote:
> Is there something broken about my installation?  I see simple
> arithmetic expressions apparently compiling and working but I can
> easily find stuff that breaks... so far I think it's anything
> involving string literals:

That definitely should all work. Did you compile with lto and force it
to internalize all symbols or such?


> postgres=# set jit_above_cost = 0;
> SET
> postgres=# select quote_ident('x');
> ERROR:  failed to resolve name MakeExpandedObjectReadOnlyInternal

...

> The clang that was used for bitcode was the system /usr/bin/clang,
> version 4.0.  Is it a problem that I used that for compiling the
> bitcode, but LLVM5 for JIT?

No, I did that locally without problems.


> I actually tried CLANG=/usr/local/llvm50/bin/clang but ran into weird
> failures I haven't got to the bottom of at ThinLink time so I couldn't
> get as far as a running system.

So you had clang 5 level issues rather than issues with this patchset, do I
understand correctly?


> I installed llvm50 from a package.  I did need to make a tiny tweak by
> hand: in src/Makefile.global, llvm-config --system-libs had said
> -l/usr/lib/libexecinfo.so which wasn't linking and looks wrong to me
> so I changed it to -lexecinfo, noted that it worked and reported a bug
> upstream:  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225621

Yea, that seems outside of my / our hands.

- Andres



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Jeff Davis
On Thu, Feb 1, 2018 at 10:09 PM, Thomas Munro wrote:
> On Fri, Feb 2, 2018 at 7:06 PM, Jeff Davis  wrote:
>> /usr/include/c++/5/cmath:505:22: error: conflicting declaration of C
>> function ‘long double
>> ...
>> /usr/include/c++/5/cmath:926:3: error: template with C linkage
>
> I suspect you can fix these with this change:
>
> +#ifdef __cplusplus
> +}
> +#endif
>  #include 
> +#ifdef __cplusplus
> +extern "C"
> +{
> +#endif
>
> ... in llvmjit.h.

Thanks! That worked, but I had to remove the "-stdlib=libc++" also,
which was causing me problems.

Regards,
Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Thomas Munro
On Fri, Feb 2, 2018 at 7:06 PM, Jeff Davis  wrote:
> /usr/include/c++/5/cmath:505:22: error: conflicting declaration of C
> function ‘long double
> ...
> /usr/include/c++/5/cmath:926:3: error: template with C linkage

I suspect you can fix these with this change:

+#ifdef __cplusplus
+}
+#endif
 #include 
+#ifdef __cplusplus
+extern "C"
+{
+#endif

... in llvmjit.h.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Jeff Davis
On Thu, Feb 1, 2018 at 5:05 PM, Andres Freund  wrote:
> Just to understand: You're running into the issue with the header being
> included from within the extern "C" {}?  Hm, I've pushed a quick fix for
> that.
>
> Other than that, you can compile with both gcc or clang, but clang needs
> to be available. Will be guessed from PATH if clang clang-5.0 clang-4.0
> (in that order) exist, similar with llvm-config llvm-config-5.0 being
> guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides
> both of that. E.g.
> ./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config
> is what I use, although I also add:
> LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'
> so I don't have to install llvm anywhere the system knows about.


On Ubuntu 16.04
SHA1: 302b7a284
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609

packages: llvm-5.0 llvm-5.0-dev llvm-5.0-runtime libllvm-5.0
clang-5.0 libclang-common-5.0-dev libclang1-5.0

./configure --with-llvm --prefix=/home/jdavis/install/pgsql-dev
...
checking for llvm-config... no
checking for llvm-config-5.0... llvm-config-5.0
checking for clang... no
checking for clang-5.0... clang-5.0
checking for LLVMOrcGetSymbolAddressIn... no
checking for LLVMGetHostCPUName... no
checking for LLVMOrcRegisterGDB... no
checking for LLVMOrcRegisterPerf... no
checking for LLVMOrcUnregisterPerf... no
...

That encounters errors like:

/usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file
requires compiler and library support for the ISO C++ 2011 standard.
This support must be enabled with the -std=c++11 or -std=gnu++11
compiler options.
...
/usr/include/c++/5/cmath:505:22: error: conflicting declaration of C
function ‘long double
...
/usr/include/c++/5/cmath:926:3: error: template with C linkage
...

So I reconfigure with:
CXXFLAGS="-std=c++11" ./configure --with-llvm
--prefix=/home/jdavis/install/pgsql-dev

I think that got rid of the first error, but the other errors remain.

I also tried installing libc++-dev and using CC=clang-5.0
CXX=clang++-5.0 and with CXXFLAGS="-std=c++11 -stdlib=libc++" but I am
not making much progress, I'm still getting:

/usr/include/c++/v1/cmath:316:1: error: templates must have C++ linkage

I suggest that you share your exact configuration so we can get past
this for now, and you can work on the build issues in the background.
We can't be the first ones with this problem; maybe you can just ask
on an LLVM channel what the right thing to do is that will work on a
variety of machines (or at least reliably detect the problem at
configure time)?

Regards,
Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Thomas Munro
On Fri, Feb 2, 2018 at 5:11 PM, Thomas Munro wrote:
> Another small thing which might be environmental... llvmjit_types.bc
> is getting installed into ${prefix}/lib here, but you're looking for
> it in ${prefix}/lib/postgresql:

Is there something broken about my installation?  I see simple
arithmetic expressions apparently compiling and working but I can
easily find stuff that breaks... so far I think it's anything
involving string literals:

postgres=# set jit_above_cost = 0;
SET
postgres=# select quote_ident('x');
ERROR:  failed to resolve name MakeExpandedObjectReadOnlyInternal

Well actually just select 'hello world' does it.  I've attached a backtrace.

Tab completion is broken for me with jit_above_cost = 0 due to
tab-complete.c queries failing with various other errors including:

set :
ERROR:  failed to resolve name ExecEvalScalarArrayOp

update :
ERROR:  failed to resolve name quote_ident

show :
ERROR:  failed to resolve name slot_getsomeattrs

I wasn't sure from your status message how much of this is expected at
this stage...

This is built from:

commit 302b7a284d30fb0e00eb5f0163aa933d4d9bea10 (HEAD -> jit, andresfreund/jit)

... plus the extern "C" tweak I posted earlier to make my clang 4.0
compiler happy, built on a FreeBSD 11.1 box with:

./configure --prefix=/home/munro/install/ --enable-tap-tests
--enable-cassert --enable-debug --enable-depend --with-llvm CC="ccache
cc" CXX="ccache c++" CXXFLAGS="-std=c++11"
LLVM_CONFIG=/usr/local/llvm50/bin/llvm-config
--with-libraries="/usr/local/lib" --with-includes="/usr/local/include"

The clang that was used for bitcode was the system /usr/bin/clang,
version 4.0.  Is it a problem that I used that for compiling the
bitcode, but LLVM5 for JIT?  I actually tried
CLANG=/usr/local/llvm50/bin/clang but ran into weird failures I
haven't got to the bottom of at ThinLink time so I couldn't get as far
as a running system.

I installed llvm50 from a package.  I did need to make a tiny tweak by
hand: in src/Makefile.global, llvm-config --system-libs had said
-l/usr/lib/libexecinfo.so which wasn't linking and looks wrong to me
so I changed it to -lexecinfo, noted that it worked and reported a bug
upstream:  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225621

-- 
Thomas Munro
http://www.enterprisedb.com
Core file '/home/munro/junk/pgdata/postgres.core' (x86_64) was loaded.
(lldb) bt
* thread #1, name = 'postgres', stop reason = signal SIGABRT
  * frame #0: 0x0008039e284a libc.so.7`__sys_thr_kill + 10
frame #1: 0x0008039e2814 libc.so.7`raise + 52
frame #2: 0x0008039e2789 libc.so.7`abort + 73
frame #3: postgres`ExceptionalCondition(conditionName=, 
errorType=, fileName=, lineNumber=) at 
assert.c:54
frame #4: postgres`llvm_resolve_symbol(name=, 
ctx=) at llvmjit.c:581
frame #5: 0x00bef86f 
postgres`_ZZN4llvm17OrcCBindingsStack14createResolverEPFmPKcPvES3_ENKUlRKNSt3__112basic_stringIcNS6_11char_traitsIcEENS6_9allocatorIcE_clESE_
 + 575
frame #6: 0x00bef602 
postgres`_ZN4llvm3orc14LambdaResolverIZNS_17OrcCBindingsStack14createResolverEPFmPKcPvES5_EUlRKNSt3__112basic_stringIcNS8_11char_traitsIcEENS8_9allocatorIcE_ZNS2_14createResolverES7_S5_EUlSG_E0_E24findSymbolInLogicalDylibESG_
 + 18
frame #7: 0x0187f063 
postgres`llvm::RuntimeDyldImpl::resolveExternalSymbols(void) + 675
frame #8: 0x0187e499 
postgres`llvm::RuntimeDyldImpl::resolveRelocations(void) + 201
frame #9: 0x01884b7e 
postgres`llvm::RuntimeDyld::finalizeWithMemoryManagerLocking(void) + 30
frame #10: 0x00bf127d 
postgres`_ZN4llvm3orc24RTDyldObjectLinkingLayer20ConcreteLinkedObjectINSt3__110shared_ptrINS_11RuntimeDyld13MemoryManagerEEENS4_INS_17JITSymbolResolverEEEZNS1_9addObjectENS4_INS_6object12OwningBinaryINSA_10ObjectFileES9_EUlNS3_15__list_iteratorINS3_10unique_ptrINS0_28RTDyldObjectLinkingLayerBase12LinkedObjectENS3_14default_deleteISI_PvEERS5_RKSE_NS3_8functionIFvv_E8finalizeEv
 + 221
frame #11: 0x00bf19fb 
postgres`_ZNSt3__128__invoke_void_return_wrapperIN4llvm8ExpectedImEEE6__callIJRZNS1_3orc24RTDyldObjectLinkingLayer20ConcreteLinkedObjectINS_10shared_ptrINS1_11RuntimeDyld13MemoryManagerEEENS9_INS1_17JITSymbolResolverEEEZNS7_9addObjectENS9_INS1_6object12OwningBinaryINSF_10ObjectFileESE_EUlNS_15__list_iteratorINS_10unique_ptrINS6_28RTDyldObjectLinkingLayerBase12LinkedObjectENS_14default_deleteISN_PvEERSA_RKSJ_NS_8functionIFvv_E21getSymbolMaterializerENS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEUlvE_EEES3_DpOT_
 + 59
frame #12: 0x00bf1982 

Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Thomas Munro
Another small thing which might be environmental... llvmjit_types.bc
is getting installed into ${prefix}/lib here, but you're looking for
it in ${prefix}/lib/postgresql:

gmake[3]: Entering directory '/usr/home/munro/projects/postgres/src/backend/lib'
/usr/bin/install -c -m 644 llvmjit_types.bc '/home/munro/install/lib'

postgres=# set jit_above_cost = 0;
SET
postgres=# set jit_expressions = on;
SET
postgres=# select 4 + 4;
ERROR:  
LLVMCreateMemoryBufferWithContentsOfFile(/usr/home/munro/install/lib/postgresql/llvmjit_types.bc)
failed: No such file or directory

$ mv ~/install/lib/llvmjit_types.bc ~/install/lib/postgresql/

postgres=# select 4 + 4;
 ?column?
----------
        8
(1 row)
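
A possible explanation for the mismatch, sketched in Python. This models my recollection of how PostgreSQL's Makefile.global derives pkglibdir from libdir (a "postgresql" subdirectory is appended unless the path already contains "pgsql" or "postgres"); treat that rule, and the function name, as assumptions to verify against the build system rather than as established fact.

```python
def pkglibdir(libdir: str) -> str:
    """Model of the assumed rule for deriving the package library directory."""
    if "pgsql" in libdir or "postgres" in libdir:
        return libdir  # path already identifies the installation
    return libdir.rstrip("/") + "/postgresql"


if __name__ == "__main__":
    # Prefix without "pgsql"/"postgres": the server looks in a subdirectory.
    print(pkglibdir("/home/munro/install/lib"))
    # Prefix containing "pgsql": both locations coincide.
    print(pkglibdir("/home/jdavis/install/pgsql-dev/lib"))
```

If that rule applies, an install step that copies llvmjit_types.bc into libdir rather than pkglibdir would only go unnoticed on prefixes where the two directories happen to be the same.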

-- 
Thomas Munro
http://www.enterprisedb.com



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Thomas Munro
On Fri, Feb 2, 2018 at 2:05 PM, Andres Freund  wrote:
> On 2018-02-01 09:32:17 -0800, Jeff Davis wrote:
>> On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik wrote:
>> > The same problem takes place with old versions of GCC: I have to upgrade
>> > GCC to 7.2 to make it possible to compile this code.
>> > The problem is not in the compiler itself, but in libc++ headers.
>>
>> How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
>> and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
>> CXXFLAGS required?
>
> Just to understand: You're running into the issue with the header being
> included from within the extern "C" {}?  Hm, I've pushed a quick fix for
> that.

That change wasn't quite enough: to get this building against libc++
(Clang's native stdlib) I also needed this change to llvmjit.h so that
the header wouldn't be included with the wrong linkage (perhaps
you can find a less ugly way):

+#ifdef __cplusplus
+}
+#endif
 #include 
+#ifdef __cplusplus
+extern "C"
+{
+#endif

> Other than that, you can compile with both gcc or clang, but clang needs
> to be available. Will be guessed from PATH if clang clang-5.0 clang-4.0
> (in that order) exist, similar with llvm-config llvm-config-5.0 being
> guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides
> both of that. E.g.
> ./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config
> is what I use, although I also add:
> LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'
> so I don't have to install llvm anywhere the system knows about.

BTW if you're building with clang (vendor compiler on at least macOS
and FreeBSD) you'll probably need CXXFLAGS=-std=c++11 (or later
standard) because it's still defaulting to '98.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Andres Freund
On 2018-02-01 09:32:17 -0800, Jeff Davis wrote:
> On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik wrote:
> > The same problem takes place with old versions of GCC: I have to upgrade GCC
> > to 7.2 to make it possible to compile this code.
> > The problem is not in the compiler itself, but in libc++ headers.
> 
> How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
> and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
> CXXFLAGS required?

Just to understand: You're running into the issue with the header being
included from within the extern "C" {}?  Hm, I've pushed a quick fix for
that.

Other than that, you can compile with both gcc or clang, but clang needs
to be available. Will be guessed from PATH if clang clang-5.0 clang-4.0
(in that order) exist, similar with llvm-config llvm-config-5.0 being
guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides
both of that. E.g.
./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config
is what I use, although I also add:
LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'
so I don't have to install llvm anywhere the system knows about.
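
The lookup order described above can be sketched as follows, in Python rather than autoconf. The candidate lists are the ones named in this message; the function name and the injectable "which" parameter are illustrative assumptions, and this is not how configure is actually implemented.

```python
import shutil
from typing import Callable, List, Optional

CLANG_CANDIDATES = ["clang", "clang-5.0", "clang-4.0"]
LLVM_CONFIG_CANDIDATES = ["llvm-config", "llvm-config-5.0"]


def pick_tool(candidates: List[str],
              override: Optional[str] = None,
              which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the explicit override if given, else the first candidate on PATH."""
    if override:
        return override  # e.g. CLANG=... or LLVM_CONFIG=... passed to configure
    for name in candidates:
        found = which(name)
        if found:
            return found
    return None


if __name__ == "__main__":
    print(pick_tool(CLANG_CANDIDATES))
    print(pick_tool(LLVM_CONFIG_CANDIDATES))
```

On a machine that ships only versioned binaries, the unversioned name misses and the first versioned candidate is picked, unless an explicit path is supplied.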

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Jeff Davis
On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik wrote:
> The same problem takes place with old versions of GCC: I have to upgrade GCC
> to 7.2 to make it possible to compile this code.
> The problem is not in the compiler itself, but in libc++ headers.

How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
CXXFLAGS required?

Regards,
 Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Merlin Moncure
On Wed, Jan 31, 2018 at 1:45 PM, Robert Haas  wrote:
> On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund  wrote:
>>> The first one is a problem that's not going to go away.  If the
>>> problem of JIT being enabled "magically" is something we're concerned
>>> about, we need to figure out a good solution, not just disable the
>>> feature by default.
>>
>> That's a fair argument, and I don't really have a good answer to it. We
>> could have a jit = off/try/on, and use that to signal things? I.e. it
>> can be set to try (possibly default in version + 1), and things will
>> work if it's not installed, but if set to on it'll refuse to work if not
>> enabled. Similar to how huge pages work now.
>
> We could do that, but I'd be more inclined just to let JIT be
> magically enabled.  In general, if a user could do 'yum install ip4r'
> (for example) and have that Just Work without any further database
> configuration, I think a lot of people would consider that to be a
> huge improvement.  Unfortunately we can't really do that for various
> reasons, the biggest of which is that there's no way for installing an
> OS package to modify the internal state of a database that may not
> even be running at the time.  But as a general principle, I think
> having to configure both the OS and the DB is an anti-feature, and
> that if installing an extra package is sufficient to get the
> new-and-improved behavior, users will like it.  Bonus points if it
> doesn't require a server restart.

You bet.  It'd be helpful to have some obvious, well advertised ways
to determine when it's enabled and when it isn't, and to have a
straightforward process to determine what to fix when it's not enabled
and the user thinks it ought to be though.

merlin



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Andres Freund
On 2018-02-01 08:46:08 -0500, Peter Eisentraut wrote:
> On 1/31/18 14:45, Robert Haas wrote:
> > We could do that, but I'd be more inclined just to let JIT be
> > magically enabled.  In general, if a user could do 'yum install ip4r'
> > (for example) and have that Just Work without any further database
> > configuration,
> 
> One way to do that would be to have a system-wide configuration file
> like /usr/local/pgsql/etc/postgresql/postgresql.conf, which in turn
> includes /usr/local/pgsql/etc/postgresql/postgresql.conf.d/*, and have
> the add-on package install its configuration file with the setting jit =
> on there.

I think Robert's comment about extensions wasn't about extensions and
jit, just about needing CREATE EXTENSION. I don't see any
need for per-extension/shlib configurability of JITing.


> Then again, if we want to make it simpler, just link the whole thing in
> and turn it on by default and be done with it.

I'd personally be ok with that too...

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Peter Eisentraut
On 1/31/18 14:45, Robert Haas wrote:
> We could do that, but I'd be more inclined just to let JIT be
> magically enabled.  In general, if a user could do 'yum install ip4r'
> (for example) and have that Just Work without any further database
> configuration,

One way to do that would be to have a system-wide configuration file
like /usr/local/pgsql/etc/postgresql/postgresql.conf, which in turn
includes /usr/local/pgsql/etc/postgresql/postgresql.conf.d/*, and have
the add-on package install its configuration file with the setting jit =
on there.

Then again, if we want to make it simpler, just link the whole thing in
and turn it on by default and be done with it.

Presumably, there will be planner-level knobs to model the jit startup
time, and if you don't like it, you can set that very high to disable
it.  So we don't necessarily need a separate turn-it-off-it's-broken
setting.
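
A tiny sketch of how such a planner-level knob could double as the off switch, in Python. The name jit_above_cost is borrowed from the setting used elsewhere in this thread; the strict "greater than" comparison and the use of infinity as "very high" are assumptions for illustration, not a description of the patch.

```python
import math


def use_jit(plan_cost: float, jit_above_cost: float) -> bool:
    """JIT only when the estimated plan cost exceeds the threshold."""
    return plan_cost > jit_above_cost


if __name__ == "__main__":
    print(use_jit(0.01, 0.0))       # threshold 0: any costed plan is compiled
    print(use_jit(1e9, math.inf))   # very high threshold: effectively disabled
```

The design point is that one numeric threshold covers both tuning and disabling, so no separate boolean is strictly required.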

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-02-01 Thread Peter Eisentraut
On 1/31/18 13:34, Andres Freund wrote:
> That's a fair argument, and I don't really have a good answer to it. We
> could have a jit = off/try/on, and use that to signal things? I.e. it
> can be set to try (possibly default in version + 1), and things will
> work if it's not installed, but if set to on it'll refuse to work if not
> enabled. Similar to how huge pages work now.

But that setup also has the problem that you can't query the setting to
know whether it's actually on.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Robert Haas
On Wed, Jan 31, 2018 at 2:49 PM, Andres Freund  wrote:
>> We could do that, but I'd be more inclined just to let JIT be
>> magically enabled.  In general, if a user could do 'yum install ip4r'
>> (for example) and have that Just Work without any further database
>> configuration, I think a lot of people would consider that to be a
>> huge improvement.  Unfortunately we can't really do that for various
>> reasons, the biggest of which is that there's no way for installing an
>> OS package to modify the internal state of a database that may not
>> even be running at the time.  But as a general principle, I think
>> having to configure both the OS and the DB is an anti-feature, and
>> that if installing an extra package is sufficient to get the
>> new-and-improved behavior, users will like it.
>
> I'm not seeing a contradiction between what you describe as desired, and
> what I describe?  If it defaulted to try, that'd just do what you want,
> no? I do think it's important to configure the system so it'll error if
> JITing is not available.

Hmm, I guess that's true.  I'm not sure that we really need a way to
error out if JIT is not available, but maybe we do.

>> Bonus points if it doesn't require a server restart.
>
> I think server restart might be doable (although it'll increase memory
> usage because the shlib needs to be loaded in each backend rather than
> postmaster), but once a session is running I'm fairly sure we do not
> want to retry. Re-checking whether a shlib is available on the
> filesystem every query does not sound like a good idea...

Agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Andres Freund
On 2018-01-31 14:45:46 -0500, Robert Haas wrote:
> On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund  wrote:
> >> The first one is a problem that's not going to go away.  If the
> >> problem of JIT being enabled "magically" is something we're concerned
> >> about, we need to figure out a good solution, not just disable the
> >> feature by default.
> >
> > That's a fair argument, and I don't really have a good answer to it. We
> > could have a jit = off/try/on, and use that to signal things? I.e. it
> > can be set to try (possibly default in version + 1), and things will
> > work if it's not installed, but if set to on it'll refuse to work if not
> > enabled. Similar to how huge pages work now.
> 
> We could do that, but I'd be more inclined just to let JIT be
> magically enabled.  In general, if a user could do 'yum install ip4r'
> (for example) and have that Just Work without any further database
> configuration, I think a lot of people would consider that to be a
> huge improvement.  Unfortunately we can't really do that for various
> reasons, the biggest of which is that there's no way for installing an
> OS package to modify the internal state of a database that may not
> even be running at the time.  But as a general principle, I think
> having to configure both the OS and the DB is an anti-feature, and
> that if installing an extra package is sufficient to get the
> new-and-improved behavior, users will like it.

I'm not seeing a contradiction between what you describe as desired, and
what I describe?  If it defaulted to try, that'd just do what you want,
no? I do think it's important to configure the system so it'll error if
JITing is not available.
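
The off/try/on semantics under discussion can be sketched like this, in Python. The enum, the exception class and the function name are hypothetical; the behaviour follows the description in this thread (try falls back quietly, on refuses to work if JIT is not available).

```python
from enum import Enum


class JitSetting(Enum):
    OFF = "off"
    TRY = "try"
    ON = "on"


class JitUnavailableError(RuntimeError):
    """Raised when jit = on but the JIT provider cannot be used."""


def resolve_jit(setting: JitSetting, provider_available: bool) -> bool:
    """Return whether JIT is effectively in use for this session."""
    if setting is JitSetting.OFF:
        return False
    if provider_available:
        return True
    if setting is JitSetting.ON:
        raise JitUnavailableError("jit = on, but the JIT library is not installed")
    return False  # TRY: fall back to the interpreter without an error


if __name__ == "__main__":
    print(resolve_jit(JitSetting.TRY, provider_available=False))
    print(resolve_jit(JitSetting.TRY, provider_available=True))
```

The returned value is the effective state, which differs from the configured value whenever try is set and the library is missing. Exposing that effective state separately would be one way to address the concern that the setting alone does not say whether JIT is actually on.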


> Bonus points if it doesn't require a server restart.

I think server restart might be doable (although it'll increase memory
usage because the shlib needs to be loaded in each backend rather than
postmaster), but once a session is running I'm fairly sure we do not
want to retry. Re-checking whether a shlib is available on the
filesystem every query does not sound like a good idea...
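
A sketch of the "probe once per session, never retry" behaviour, in Python. The class name and the loader callback are hypothetical stand-ins for loading the JIT shared library; the only point is that a failed attempt is remembered for the rest of the session.

```python
from typing import Any, Callable, Optional


class SessionJitProvider:
    """Caches the outcome of the first load attempt for the session."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader      # hypothetical: loads the JIT shlib
        self._attempted = False
        self._provider: Optional[Any] = None

    def get(self) -> Optional[Any]:
        """Return the provider, or None if the single load attempt failed."""
        if not self._attempted:
            self._attempted = True  # set first so a failure is never retried
            try:
                self._provider = self._loader()
            except OSError:
                self._provider = None
        return self._provider


if __name__ == "__main__":
    def missing_library() -> Any:
        raise OSError("shared library not found")

    session = SessionJitProvider(missing_library)
    print(session.get(), session.get())
```

Installing the library later would then take effect for new sessions only, which matches the idea of not touching the filesystem on every query.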

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Andres Freund
Hi,

On 2018-01-31 11:56:59 -0500, Robert Haas wrote:
> On Tue, Jan 30, 2018 at 5:57 PM, Andres Freund  wrote:
> > Given that we need a shared library it'll be best buildsystem wise if
> > all of this is in a directory, and there's a separate file containing
> > the stubs that call into it.
> >
> > I'm not quite sure where to put the code. I'm a bit inclined to add a
> > new
> > src/backend/jit/
> > because we're dealing with code from across different categories? There
> > we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> > specific code?
> 
> That's kind of ugly, in that if we eventually end up with many
> different parts of the system using JIT, they're all going to have to
> all put their code in that directory rather than putting it with the
> subsystem to which it pertains.

Yea, that's what I really dislike about the idea too.

> On the other hand, I don't really have a better idea.

I guess one alternative would be to leave the individual files in their
subsystem directories, but not in the corresponding OBJS lists, and
instead pick them up from the makefile in the jit shlib?  That might be
better...

It's a bit weird because the files wouldn't be compiled when make-ing that
directory but rather when the jit shlib one is made, but that's not too
bad.


> I'd definitely at least try to keep executor-specific considerations
> in a separate FILE from general JIT infrastructure, and make, as far
> as possible, a clean separation at the API level.

Absolutely.  Right now there are general infrastructure files (error
handling, optimization, inlining), expression compilation, tuple deform
compilation, and I thought to continue keeping the files separate just
like that.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Andres Freund
Hi,

On 2018-01-31 11:53:25 -0500, Robert Haas wrote:
> On Wed, Jan 31, 2018 at 10:22 AM, Peter Eisentraut
>  wrote:
> > On 1/30/18 21:55, Andres Freund wrote:
> >> I'm open to changing my mind on it, but it seems a bit weird that a
> >> feature that relies on a shlib being installed magically turns itself on
> >> if available. And leaving that angle aside, ISTM, that it's a complex
> >> enough feature that it should be opt-in the first release... Think we
> >> roughly did that right for e.g. parallelism.
> >
> > That sounds reasonable, for both of those reasons.
> 
> The first one is a problem that's not going to go away.  If the
> problem of JIT being enabled "magically" is something we're concerned
> about, we need to figure out a good solution, not just disable the
> feature by default.

That's a fair argument, and I don't really have a good answer to it. We
could have a jit = off/try/on, and use that to signal things? I.e. it
can be set to try (possibly default in version + 1), and things will
work if it's not installed, but if set to on it'll refuse to work if not
enabled. Similar to how huge pages work now.
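
A rough C sketch of how such an off/try/on setting could be resolved,
purely as an illustration. The enum and function names are invented for
this example, and it reads "not enabled" above as "the JIT library is not
installed", which is an assumption.

```c
#include <stdbool.h>

/* Hypothetical tri-state setting, mirroring the off/try/on proposal. */
typedef enum
{
    JIT_SETTING_OFF,
    JIT_SETTING_TRY,
    JIT_SETTING_ON
} JitSetting;

/* What the executor would end up doing for a query. */
typedef enum
{
    JIT_DECISION_INTERPRET,     /* run without JIT */
    JIT_DECISION_COMPILE,       /* use JIT */
    JIT_DECISION_ERROR          /* refuse: JIT demanded but unavailable */
} JitDecision;

JitDecision
jit_decide(JitSetting setting, bool provider_installed)
{
    switch (setting)
    {
        case JIT_SETTING_OFF:
            return JIT_DECISION_INTERPRET;
        case JIT_SETTING_TRY:
            /* "try" silently falls back when the library is missing */
            return provider_installed ? JIT_DECISION_COMPILE
                                      : JIT_DECISION_INTERPRET;
        case JIT_SETTING_ON:
            /* "on" is strict: a missing library is an error */
            return provider_installed ? JIT_DECISION_COMPILE
                                      : JIT_DECISION_ERROR;
    }
    return JIT_DECISION_ERROR;  /* unreachable for valid enum values */
}
```

The only difference between "try" and "on" is the missing-library case,
which is the same split as with huge pages.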

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Robert Haas
On Tue, Jan 30, 2018 at 5:57 PM, Andres Freund  wrote:
> Given that we need a shared library it'll be best buildsystem wise if
> all of this is in a directory, and there's a separate file containing
> the stubs that call into it.
>
> I'm not quite sure where to put the code. I'm a bit inclined to add a
> new
> src/backend/jit/
> because we're dealing with code from across different categories? There
> we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> specific code?

That's kind of ugly, in that if we eventually end up with many
different parts of the system using JIT, they're all going to have to
all put their code in that directory rather than putting it with the
subsystem to which it pertains.  On the other hand, I don't really
have a better idea.  I'd definitely at least try to keep
executor-specific considerations in a separate FILE from general JIT
infrastructure, and make, as far as possible, a clean separation at
the API level.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Robert Haas
On Wed, Jan 31, 2018 at 10:22 AM, Peter Eisentraut
 wrote:
> On 1/30/18 21:55, Andres Freund wrote:
>> I'm open to changing my mind on it, but it seems a bit weird that a
>> feature that relies on a shlib being installed magically turns itself on
>> if available. And leaving that angle aside, ISTM, that it's a complex
>> enough feature that it should be opt-in the first release... Think we
>> roughly did that right for e.g. parallelism.
>
> That sounds reasonable, for both of those reasons.

The first one is a problem that's not going to go away.  If the
problem of JIT being enabled "magically" is something we're concerned
about, we need to figure out a good solution, not just disable the
feature by default.

As far as the second one, looking back at what happened with parallel
query, I found (on a quick read) 13 back-patched commits in
REL9_6_STABLE prior to the release of 10.0, 3 of which I would qualify
as low-importance (improving documentation, fixing something that's
not really a bug, improving a test case).  A couple of those were
really stupid mistakes on my part.  On the other hand, would it have
been overall worse for our users if that feature had been turned on in
9.6?  I don't know.  They would have had those bugs (at least until we
fixed them) but they would have had parallel query, too.  It's hard
for me to judge whether that was a win or a loss, and so here.  Like
parallel query, this is a feature which seems to have a low risk of
data corruption, but a fairly high risk of wrong answers to queries
and/or strange errors.   Users don't like that.  On the other hand,
also like parallel query, if you've got the right kind of queries, it
can make them go a lot faster.  Users DO like that.

So I could go either way on whether to enable this in the first
release.  I definitely would not like to see it stay disabled by
default for a second release unless we find a lot of problems with it.
There's no point in developing new features unless users are going to
get the benefit of them, and while SOME users will enable features
that aren't turned on by default, many will not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Peter Eisentraut
On 1/30/18 21:55, Andres Freund wrote:
> I'm open to changing my mind on it, but it seems a bit weird that a
> feature that relies on a shlib being installed magically turns itself on
> if available. And leaving that angle aside, ISTM, that it's a complex
> enough feature that it should be opt-in the first release... Think we
> roughly did that right for e.g. parallelism.

That sounds reasonable, for both of those reasons.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-31 Thread Konstantin Knizhnik



On 31.01.2018 05:48, Thomas Munro wrote:
>>> This seems to be a valid complaint.  I don't think you should be
>>> (indirectly) wrapping Types.h in extern "C".  At a guess, your
>>> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
>>> specifiers, so you can use it from C or C++, but making sure that you
>>> don't #include LLVM's headers from a bizarro context where __cplusplus
>>> is defined but the linkage is unexpectedly already "C"?
>>
>> Hm, this seems like a bit of pointless nitpickery by the compiler to me,
>> but I guess...
>
> Well that got me curious about how GCC could possibly be accepting
> that (it certainly doesn't like extern "C" template ... any more than
> the next compiler).  I dug a bit and realised that it's the stdlib
> that's different:  libstdc++ has its own extern "C++" in ,
> while  libc++ doesn't.

The same problem takes place with old versions of GCC: I have to upgrade
GCC to 7.2 to make it possible to compile this code.

The problem is not in the compiler itself, but in the libc++ headers.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-31 15:48:09 +1300, Thomas Munro wrote:
> On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund  wrote:
> > I'm not quite sure I understand. You mean have it display whether
> > available? I think my plan is to "just" have setting jit_expressions=on
> > (or whatever we're going to name it) fail if the prerequisites aren't
> > available. I personally don't think this should be enabled by default,
> > definitely not in the first release.
> 
> I assumed (incorrectly) that you wanted it to default to on if
> available, so I was suggesting making it obvious to end users if
> they've accidentally forgotten to install -jit.  If it's not enabled
> until you actually ask for it and trying to enable it when it's not
> installed barfs, then that seems sensible.

I'm open to changing my mind on it, but it seems a bit weird that a
feature that relies on a shlib being installed magically turns itself on
if available. And leaving that angle aside, ISTM, that it's a complex
enough feature that it should be opt-in the first release... Think we
roughly did that right for e.g. parallelism.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Thomas Munro
On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund  wrote:
> On 2018-01-31 14:42:26 +1300, Thomas Munro wrote:
>> I'm just starting to look at this (amazing) work, and I don't have a
>> strong opinion yet.  But certainly, making it easy for packagers to
>> put the -jit stuff into a separate package for the reasons already
>> given sounds sensible to me.  Some systems package LLVM as one
>> gigantic package that'll get you 1GB of compiler/debugger/other stuff
>> and perhaps violate local rules by installing a compiler when you
>> really just wanted libLLVM{whatever}.so.  I guess it should be made
>> very clear to users (explain plans, maybe startup message, ...?)
>
> I'm not quite sure I understand. You mean have it display whether
> available? I think my plan is to "just" have setting jit_expressions=on
> (or whatever we're going to name it) fail if the prerequisites aren't
> available. I personally don't think this should be enabled by default,
> definitely not in the first release.

I assumed (incorrectly) that you wanted it to default to on if
available, so I was suggesting making it obvious to end users if
they've accidentally forgotten to install -jit.  If it's not enabled
until you actually ask for it and trying to enable it when it's not
installed barfs, then that seems sensible.

>> This seems to be a valid complaint.  I don't think you should be
>> (indirectly) wrapping Types.h in extern "C".  At a guess, your
>> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
>> specifiers, so you can use it from C or C++, but making sure that you
>> don't #include LLVM's headers from a bizarro context where __cplusplus
>> is defined but the linkage is unexpectedly already "C"?
>
> Hm, this seems like a bit of pointless nitpickery by the compiler to me,
> but I guess...

Well that got me curious about how GCC could possibly be accepting
that (it certainly doesn't like extern "C" template ... any more than
the next compiler).  I dug a bit and realised that it's the stdlib
that's different:  libstdc++ has its own extern "C++" in ,
while  libc++ doesn't.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
On 2018-01-31 14:42:26 +1300, Thomas Munro wrote:
> I'm just starting to look at this (amazing) work, and I don't have a
> strong opinion yet.  But certainly, making it easy for packagers to
> put the -jit stuff into a separate package for the reasons already
> given sounds sensible to me.  Some systems package LLVM as one
> gigantic package that'll get you 1GB of compiler/debugger/other stuff
> and perhaps violate local rules by installing a compiler when you
> really just wanted libLLVM{whatever}.so.  I guess it should be made
> very clear to users (explain plans, maybe startup message, ...?)

I'm not quite sure I understand. You mean have it display whether
available? I think my plan is to "just" have setting jit_expressions=on
(or whatever we're going to name it) fail if the prerequisites aren't
available. I personally don't think this should be enabled by default,
definitely not in the first release.

> $ c++ -v
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
> LLVM 4.0.0)
> 
> This seems to be a valid complaint.  I don't think you should be
> (indirectly) wrapping Types.h in extern "C".  At a guess, your
> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
> specifiers, so you can use it from C or C++, but making sure that you
> don't #include LLVM's headers from a bizarro context where __cplusplus
> is defined but the linkage is unexpectedly already "C"?

Hm, this seems like a bit of pointless nitpickery by the compiler to me,
but I guess...

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Thomas Munro
On Wed, Jan 31, 2018 at 11:57 AM, Andres Freund  wrote:
> On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
>> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
>> > It's an optional dependency, and it doesn't increase build time that
>> > much... If we were to move the llvm interfacing code to a .so, there'd
>> > not even be a packaging issue, you can just package that .so separately
>> > and get errors if somebody tries to enable LLVM without that .so being
>> > installed.
>>
>> I suspect that would be really valuable.  If 'yum install
>> postgresql-server' (or your favorite equivalent) sucks down all of
>> LLVM, some people are going to complain, either because they are
>> trying to build little tiny machine images or because they are subject
>> to policies which preclude the presence of a compiler on a production
>> server.  If you can do 'yum install postgresql-server' without
>> additional dependencies and 'yum install postgresql-server-jit' to
>> make it go faster, that issue is solved.
>
> So, I'm working on that now.  In the course of this I'll have to
> painfully rebase and rename a lot of code, which I'd like not to repeat
> unnecessarily.
>
> Right now there primarily is:
>
> src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
> src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
> src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
> src/backend/access/common/heaptuple.c - emit LLVM IR for deforming
>
> Given that we need a shared library it'll be best buildsystem wise if
> all of this is in a directory, and there's a separate file containing
> the stubs that call into it.
>
> I'm not quite sure where to put the code. I'm a bit inclined to add a
> new
> src/backend/jit/
> because we're dealing with code from across different categories? There
> we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> specific code?
>
> Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
> and the actual llvm using code into src/backend/executor/llvmjit/?
>
> Comments?

I'm just starting to look at this (amazing) work, and I don't have a
strong opinion yet.  But certainly, making it easy for packagers to
put the -jit stuff into a separate package for the reasons already
given sounds sensible to me.  Some systems package LLVM as one
gigantic package that'll get you 1GB of compiler/debugger/other stuff
and perhaps violate local rules by installing a compiler when you
really just wanted libLLVM{whatever}.so.  I guess it should be made
very clear to users (explain plans, maybe startup message, ...?)
whether JIT support is active/installed so that people are at least
very aware when they encounter a system that is interpreting stuff it
could be compiling.   Putting all the JIT into a separate directory
under src/backend/jit certainly looks sensible at first glance, but
I'm not sure.

Incidentally, from commit fdc6c7a6dddbd6df63717f2375637660bcd00fc6
(HEAD -> jit, andresfreund/jit) on your branch I get:

ccache c++ -Wall -Wpointer-arith -fno-strict-aliasing -fwrapv -g -g
-O2 -fno-exceptions -I../../../src/include
-I/usr/local/llvm50/include -DLLVM_BUILD_GLOBAL_ISEL
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/local/include  -c -o llvmjit_error.o llvmjit_error.cpp -MMD -MP
-MF .deps/llvmjit_error.Po
In file included from llvmjit_error.cpp:26:
In file included from ../../../src/include/lib/llvmjit.h:48:
In file included from /usr/local/llvm50/include/llvm-c/Types.h:17:
In file included from /usr/local/llvm50/include/llvm/Support/DataTypes.h:33:
/usr/include/c++/v1/cmath:555:1: error: templates must have C++ linkage
template 
^~~~
llvmjit_error.cpp:24:1: note: extern "C" language linkage
specification begins here
extern "C"
^

$ c++ -v
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
LLVM 4.0.0)

This seems to be a valid complaint.  I don't think you should be
(indirectly) wrapping Types.h in extern "C".  At a guess, your
llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
specifiers, so you can use it from C or C++, but making sure that you
don't #include LLVM's headers from a bizarro context where __cplusplus
is defined but the linkage is unexpectedly already "C"?
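
For what it's worth, the usual shape of a header that works from both C
and C++ is sketched below. This is not the actual llvmjit.h; the header
name, guard macro and function are hypothetical. The point is only that
other headers are #included at file scope, before the linkage block
opens, and that the extern "C" block wraps nothing but the header's own
declarations.

```c
/* demo_jit.h: hypothetical header, usable from both C and C++. */
#ifndef DEMO_JIT_H
#define DEMO_JIT_H

/*
 * System and third-party headers go first, outside any extern "C" block,
 * so that a C++ compiler sees their templates with normal C++ linkage.
 */
#include <stddef.h>

#ifdef __cplusplus
extern "C"
{
#endif

/* Only this header's own C-callable declarations live in here. */
size_t      demo_jit_name_length(const char *name);

#ifdef __cplusplus
}
#endif

#endif                          /* DEMO_JIT_H */

/* demo_jit.c: trivial definition so the sketch compiles on its own. */
#include <string.h>

size_t
demo_jit_name_length(const char *name)
{
    return name ? strlen(name) : 0;
}
```

A .cpp file would then #include such a header directly rather than from
inside its own extern "C" { ... } block, which is what the "linkage
specification begins here" note in the error output points at.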

-- 
Thomas Munro
http://www.enterprisedb.com



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Jason Petersen
> On Jan 30, 2018, at 2:08 PM, Andres Freund  wrote:
> 
> With things like apt recommends and such I don't think this is a huge problem.


I don’t believe there is a similar widely-supported dependency type in yum/rpm, 
though. rpm 4.12 adds support for Weak Dependencies, which have 
Recommends/Suggests-style semantics, but AFAIK it’s not going to be on most RPM 
machines (I haven’t checked most OSes yet, but IIRC it’s mostly a Fedora thing 
at this point?)

Which means in the rpm packages we’ll have to decide whether this is required 
or must be opt-in by end users (which as discussed would hurt adoption).

--
Jason Petersen
Software Engineer | Citus Data
303.736.9255
ja...@citusdata.com



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> > It's an optional dependency, and it doesn't increase build time that
> > much... If we were to move the llvm interfacing code to a .so, there'd
> > not even be a packaging issue, you can just package that .so separately
> > and get errors if somebody tries to enable LLVM without that .so being
> > installed.
> 
> I suspect that would be really valuable.  If 'yum install
> postgresql-server' (or your favorite equivalent) sucks down all of
> LLVM, some people are going to complain, either because they are
> trying to build little tiny machine images or because they are subject
> to policies which preclude the presence of a compiler on a production
> server.  If you can do 'yum install postgresql-server' without
> additional dependencies and 'yum install postgresql-server-jit' to
> make it go faster, that issue is solved.

So, I'm working on that now.  In the course of this I'll have to
painfully rebase and rename a lot of code, which I'd like not to repeat
unnecessarily.

Right now there primarily is:

src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
src/backend/access/common/heaptuple.c - emit LLVM IR for deforming

Given that we need a shared library it'll be best buildsystem wise if
all of this is in a directory, and there's a separate file containing
the stubs that call into it.
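
To make the "stubs that call into it" idea concrete, a minimal sketch in C
follows. It assumes one possible design, a callback table that the shared
library would fill in when loaded; all names (DemoJitCallbacks,
jit_register_provider, jit_compile_expr, demo_compile_expr) are invented
for the illustration and not from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table of entry points provided by the JIT shared library. */
typedef struct DemoJitCallbacks
{
    bool        (*compile_expr) (const char *expr);
} DemoJitCallbacks;

/* Lives in the core server; stays NULL while no provider is loaded. */
const DemoJitCallbacks *jit_provider = NULL;

/* Would be called by the shared library once it has been loaded. */
void
jit_register_provider(const DemoJitCallbacks *callbacks)
{
    jit_provider = callbacks;
}

/*
 * The stub: the only thing the rest of the server calls.  Returning
 * false means "not compiled", and the caller keeps interpreting.
 */
bool
jit_compile_expr(const char *expr)
{
    if (jit_provider == NULL || jit_provider->compile_expr == NULL)
        return false;
    return jit_provider->compile_expr(expr);
}

/* Demo implementation standing in for the LLVM-specific code. */
bool
demo_compile_expr(const char *expr)
{
    return expr != NULL;
}
```

The stub file has no LLVM dependency at all, so it can be linked into the
server unconditionally while the LLVM-using code is packaged separately.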

I'm not quite sure where to put the code. I'm a bit inclined to add a
new
src/backend/jit/
because we're dealing with code from across different categories? There
we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
specific code?

Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
and the actual llvm using code into src/backend/executor/llvmjit/?

Comments?

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 22:57:06 +0100, David Fetter wrote:
> On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
> > On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> > > It's an optional dependency, and it doesn't increase build time
> > > that much... If we were to move the llvm interfacing code to a
> > > .so, there'd not even be a packaging issue, you can just package
> > > that .so separately and get errors if somebody tries to enable
> > > LLVM without that .so being installed.
> > 
> > I suspect that would be really valuable.  If 'yum install
> > postgresql-server' (or your favorite equivalent) sucks down all of
> > LLVM,
> 
> As I understand it, LLVM is organized in such a way as not to require
> this.  Andres, am I understanding correctly that what you're using
> doesn't require much of LLVM at runtime?

I'm not sure what you exactly mean. Yes, you need the llvm library at
runtime. Perhaps you're thinking of clang or llvm binaries? The
latter we do *not* need.

What's required is something like:
$ apt show libllvm5.0
Package: libllvm5.0
Version: 1:5.0.1-2
Priority: optional
Section: libs
Source: llvm-toolchain-5.0
Maintainer: LLVM Packaging Team 
Installed-Size: 56.9 MB
Depends: libc6 (>= 2.15), libedit2 (>= 2.11-20080614), libffi6 (>= 3.0.4), 
libgcc1 (>= 1:3.4), libstdc++6 (>= 6), libtinfo5 (>= 6), zlib1g (>= 1:1.2.0)
Breaks: libllvm3.9v4
Replaces: libllvm3.9v4
Homepage: http://www.llvm.org/
Tag: role::shared-lib
Download-Size: 13.7 MB
APT-Manual-Installed: no
APT-Sources: http://debian.osuosl.org/debian unstable/main amd64 Packages
Description: Modular compiler and toolchain technologies, runtime library
 LLVM is a collection of libraries and tools that make it easy to build
 compilers, optimizers, just-in-time code generators, and many other
 compiler-related programs.
 .
 This package contains the LLVM runtime library.

So ~14MB to download, ~57MB on disk.  We only need a subset of
libllvm5.0, and LLVM allows building such a subset. But obviously
distributions aren't going to tailor their LLVM just for postgres.


> > Unfortunately, that has the pretty significant downside that a lot of
> > people who actually want the postgresql-server-jit package will not
> > realize that they need to install it, which sucks.
> 
> It does indeed.

With things like apt recommends and such I don't think this is a huge
problem.  It'll be installed by default unless somebody is on a space
constrained system and doesn't want that...

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread David Fetter
On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> > It's an optional dependency, and it doesn't increase build time
> > that much... If we were to move the llvm interfacing code to a
> > .so, there'd not even be a packaging issue, you can just package
> > that .so separately and get errors if somebody tries to enable
> > LLVM without that .so being installed.
> 
> I suspect that would be really valuable.  If 'yum install
> postgresql-server' (or your favorite equivalent) sucks down all of
> LLVM,

As I understand it, LLVM is organized in such a way as not to require
this.  Andres, am I understanding correctly that what you're using
doesn't require much of LLVM at runtime?

> some people are going to complain, either because they are
> trying to build little tiny machine images or because they are
> subject to policies which preclude the presence of a compiler on a
> production server.  If you can do 'yum install postgresql-server'
> without additional dependencies and 'yum install
> postgresql-server-jit' to make it go faster, that issue is solved.

Would you consider it solved if there were some very small part of the
LLVM (or similar JIT-capable) toolchain added as a dependency, or does
it need to be optional into a long future?

> Unfortunately, that has the pretty significant downside that a lot of
> people who actually want the postgresql-server-jit package will not
> realize that they need to install it, which sucks.

It does indeed.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Tomas Vondra


On 01/30/2018 12:24 AM, Andres Freund wrote:
> Hi,
> 
> On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote:
>> FWIW I've installed llvm 5.0.1 from distribution package, and now
>> everything builds fine (I don't even need the configure tweak).
>>
>> I think I had to build the other binaries because there was no 5.x llvm
>> back then, but it's too far back so I don't remember.
>>
>> Anyway, seems I'm fine for now.
> 
> Phew, I'm relieved.  I'd guess you built a 5.0 version while 5.0 was
> still in development, so not all 5.0 functionality was available. Hence
> the inconsistent looking result.  While I think we can support 4.0
> without too much problem, there's obviously no point in trying to
> support old between-releases versions...
> 

That's quite possible, but I don't really remember :-/

But I ran into another issue today, where everything builds fine (llvm
5.0.1, gcc 6.4.0), but at runtime I get errors like this:

ERROR:
LLVMCreateMemoryBufferWithContentsOfFile(/home/tomas/pg-llvm/lib/postgresql/llvmjit_types.bc)
failed: No such file or directory

It seems the llvmjit_types.bc file ended up in the parent directory
(/home/tomas/pg-llvm/lib/) for some reason. After simply copying it to
the expected place everything started working.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 15:06:02 -0500, Robert Haas wrote:
> On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund  wrote:
> >> That bites, although it's probably tolerable if we expect such errors
> >> only in exceptional situations such as a needed shared library failing
> >> to load or something. Killing the session when we run out of memory
> >> during JIT compilation is not very nice at all.  Does the LLVM library
> >> have any useful hooks that we can leverage here, like a hypothetical
> >> function LLVMProvokeFailureAsSoonAsConvenient()?
> >
> > I don't see how that'd help if a memory allocation fails? We can't just
> > continue in that case? You could arguably have a reserve memory pool that
> > you release in that case and then try to continue, but that seems
> > awfully fragile.
> 
> Well, I'm just asking what the library supports.  For example:
> 
> https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html

I get that type of function, what I don't understand is how that applies
to OOM:

> If you had something like that, you could arrange to safely interrupt
> the library the next time the progress-function was called.

Yea, but how are you going to *get* to the next time, given that an
allocator just couldn't allocate memory? You can't just return a NULL
pointer because the caller will use that memory?


> > The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
> > if profiling can only be done by a superuser? Hm :/
> 
> The server's ~/.debug/jit?  Or are you somehow getting the output to the 
> client?

Yes, the server's - I'm not sure I understand the "client" bit? It's
about perf profiling, which isn't available to the client either?


Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Dagfinn Ilmari Mannsåker
Robert Haas  writes:

> Unfortunately, that has the pretty significant downside that a lot of
> people who actually want the postgresql-server-jit package will not
> realize that they need to install it, which sucks.  But I think it
> might still be the better way to go.  Anyway, it's for individual
> packagers to cope with that problem; as far as the patch goes, +1 for
> structuring things in a way which gives packagers the option to divide
> it up that way.

I don't know about rpm/yum/dnf, but in dpkg/apt one could declare that
postgresql-server recommends postgresql-server-jit, which installs the
package by default, but can be overridden by config or on the command
line.

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund  wrote:
>> That bites, although it's probably tolerable if we expect such errors
>> only in exceptional situations such as a needed shared library failing
>> to load or something. Killing the session when we run out of memory
>> during JIT compilation is not very nice at all.  Does the LLVM library
>> have any useful hooks that we can leverage here, like a hypothetical
>> function LLVMProvokeFailureAsSoonAsConvenient()?
>
> I don't see how that'd help if a memory allocation fails? We can't just
> continue in that case? You could arguably have a reserve memory pool that
> you release in that case and then try to continue, but that seems
> awfully fragile.

Well, I'm just asking what the library supports.  For example:

https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html

If you had something like that, you could arrange to safely interrupt
the library the next time the progress-function was called.
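
A small C sketch of that pattern, for illustration only. With libcurl, a
progress callback that returns non-zero makes the transfer abort; the code
below imitates that contract with a made-up long_running_work() function.
None of these names exist in LLVM, and whether LLVM offers any comparable
hook is exactly the open question here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: return true to ask the library to stop. */
typedef bool (*progress_cb) (void *arg);

/*
 * Stand-in for a long-running library call.  It polls the callback at
 * a point where stopping is safe and returns the steps it completed.
 */
int
long_running_work(int total_steps, progress_cb cb, void *arg)
{
    int         done = 0;

    for (int i = 0; i < total_steps; i++)
    {
        if (cb != NULL && cb(arg))
            break;              /* interrupted at a safe point */
        done++;
    }
    return done;
}

/* Caller-side state: pretend a cancel arrives after a few polls. */
typedef struct CancelState
{
    int         calls;          /* how often the callback has run */
    int         cancel_after;   /* polls allowed before cancelling */
} CancelState;

bool
cancel_check(void *arg)
{
    CancelState *st = arg;

    st->calls++;
    return st->calls > st->cancel_after;
}
```

This only helps for interrupts the caller decides on (query cancel and
the like); it does nothing for a failure that happens inside the library
between two polls.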

> The ones I looked at just error out.  Needing to handle OOM in a soft-fail
> manner isn't actually that common a demand, I guess :/.

Bummer.

> I mean we could periodically rescan, rescan after sighup, or such? But
> that seems like something for later to me. It's not going to be super
> common to install new extensions while a lot of sessions are
> running. And things will work in that case, the functions just won't get 
> inlined...

Fair enough.

>> > Do people feel these should be hidden behind #ifdefs, always present but
>> > prevented from being set to a meaningful value, or unrestricted?
>>
>> We shouldn't allow non-superusers to set any GUC that dumps files to
>> the data directory or provides an easy way to crash the server, run
>> the machine out of memory, or similar.
>
> I don't buy the OOM one - there's so so so many of those already...
>
> The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
> if profiling can only be done by a superuser? Hm :/

The server's ~/.debug/jit?  Or are you somehow getting the output to the client?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 13:57:50 -0500, Robert Haas wrote:
> > When a libstdc++ new or LLVM error occurs, the handlers set up by the
> > above functions trigger a FATAL error. We have to use FATAL rather than
> > ERROR, as we *cannot* reliably throw ERROR inside a foreign library
> > without risking corrupting its internal state.
> 
> That bites, although it's probably tolerable if we expect such errors
> only in exceptional situations such as a needed shared library failing
> to load or something. Killing the session when we run out of memory
> during JIT compilation is not very nice at all.  Does the LLVM library
> have any useful hooks that we can leverage here, like a hypothetical
> function LLVMProvokeFailureAsSoonAsConvenient()?

I don't see how that'd help if a memory allocation fails? We can't just
continue in that case? You could arguably have a reserve memory pool that
you release in that case and then try to continue, but that seems
awfully fragile.


> The equivalent function for PostgreSQL would do { InterruptPending =
> true; QueryCancelPending = true; }.  And maybe
> LLVMSetProgressCallback() that would get called periodically and let
> us set a handler that could check for interrupts on the PostgreSQL
> side and then call LLVMProvokeFailureAsSoonAsConvenient() as
> applicable?  This problem can't be completely unique to PostgreSQL;
> anybody who is using LLVM for JIT from a long-running process needs a
> solution, so you might think that the library would provide one.

The ones I looked at just error out.  Needing to handle OOM in a
soft-fail manner isn't actually that common a demand, I guess :/.


> > for all .index.bc files and a *combined* index over all these files is
> > built in memory.  The reason for doing so is that it allows "easy"
> > access to inlining for extensions - they can install code into
> >   $pkglibdir/bitcode/[extension]/
> > accompanied by
> >   $pkglibdir/bitcode/[extension].index.bc
> > just alongside the actual library.
> 
> But that means that if an extension is installed after the initial
> scan has been done, concurrent sessions won't notice the new files.
> Maybe that's OK, but I wonder if we can do better.

I mean we could periodically rescan, rescan after sighup, or such? But
that seems like something for later to me. It's not going to be super
common to install new extensions while a lot of sessions are
running. And things will work in that case, the functions just won't get 
inlined...
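
A rescan of the kind mentioned above could be sketched roughly as follows.
Everything here is an assumption made for illustration: sketch_rescan is a
hypothetical helper, and the directory listing is passed in as plain file
names rather than read from $pkglibdir/bitcode/.

```cpp
#include <set>
#include <string>
#include <vector>

// True if `name` ends with `suffix` and has at least one character before it.
static bool sketch_has_suffix(const std::string &name, const std::string &suffix)
{
    return name.size() > suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Given the index files already known and a fresh directory listing, return
// the *.index.bc files that appeared since the last scan and remember them.
std::vector<std::string> sketch_rescan(std::set<std::string> &known,
                                       const std::vector<std::string> &listing)
{
    std::vector<std::string> added;

    for (const std::string &name : listing)
    {
        if (!sketch_has_suffix(name, ".index.bc"))
            continue;               // skip subdirectories and unrelated files
        if (known.insert(name).second)
            added.push_back(name);  // first time this index file is seen
    }
    return added;
}
```

A session that calls this after a SIGHUP, say, would only need to merge the
returned files into its in-memory combined index.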


> > Do people feel these should be hidden behind #ifdefs, always present but
> > prevented from being set to a meaningful value, or unrestricted?
> 
> We shouldn't allow non-superusers to set any GUC that dumps files to
> the data directory or provides an easy way to crash the server, run
> the machine out of memory, or similar.

I don't buy the OOM one - there's so so so many of those already...

The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
if profiling can only be done by a superuser? Hm :/

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Robert Haas
On Wed, Jan 24, 2018 at 2:20 AM, Andres Freund  wrote:
> == Error handling ==
>
> There's two aspects to error handling.
>
> Firstly, generated (LLVM IR) and emitted functions (mmap()ed segments)
> need to be cleaned up both after a successful query execution and after
> an error.  I've settled on a fairly boring resowner based mechanism. On
> errors all expressions owned by a resowner are released, upon success
> expressions are reassigned to the parent / released on commit (unless
> executor shutdown has cleaned them up of course).

Cool.

> A second, less pretty and newly developed, aspect of error handling is
> OOM handling inside LLVM itself. The above resowner based mechanism
> takes care of cleaning up emitted code upon ERROR, but there's also the
> chance that LLVM itself runs out of memory. LLVM by default does *not*
> use any C++ exceptions. Its allocations are primarily funneled through
> the standard "new" handlers, and some direct use of malloc() and
> mmap(). For the former a 'new handler' exists
> http://en.cppreference.com/w/cpp/memory/new/set_new_handler; for the
> latter LLVM provides callbacks that get called upon failure
> (unfortunately mmap() failures are treated as fatal rather than OOM
> errors).
> What I've chosen to do, and I'd be interested to get some input about
> that, is to have two functions that LLVM using code must use:
>   extern void llvm_enter_fatal_on_oom(void);
>   extern void llvm_leave_fatal_on_oom(void);
> before interacting with LLVM code (ie. emitting IR, or using the above
> functions) llvm_enter_fatal_on_oom() needs to be called.
>
> When a libstdc++ new or LLVM error occurs, the handlers set up by the
> above functions trigger a FATAL error. We have to use FATAL rather than
> ERROR, as we *cannot* reliably throw ERROR inside a foreign library
> without risking corrupting its internal state.

That bites, although it's probably tolerable if we expect such errors
only in exceptional situations such as a needed shared library failing
to load or something. Killing the session when we run out of memory
during JIT compilation is not very nice at all.  Does the LLVM library
have any useful hooks that we can leverage here, like a hypothetical
function LLVMProvokeFailureAsSoonAsConvenient()?  The equivalent
function for PostgreSQL would do { InterruptPending = true;
QueryCancelPending = true; }.  And maybe LLVMSetProgressCallback()
that would get called periodically and let us set a handler that could
check for interrupts on the PostgreSQL side and then call
LLVMProvokeFailureAsSoonAsConvenient() as applicable?  This problem
can't be completely unique to PostgreSQL; anybody who is using LLVM
for JIT from a long-running process needs a solution, so you might
think that the library would provide one.
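
To make the idea concrete, here is a sketch of the callback shape described
above. Nothing in it is LLVM API: SketchCompiler, progress_callback and
provoke_failure are hypothetical stand-ins for the imagined
LLVMSetProgressCallback() and LLVMProvokeFailureAsSoonAsConvenient(), and the
"compilation" is just a loop.

```cpp
#include <functional>

// Hypothetical long-running worker that cooperates with its embedder.
struct SketchCompiler
{
    // Called once per unit of work; the embedder may check for interrupts.
    std::function<void(SketchCompiler &)> progress_callback;
    bool failure_requested = false;

    // Ask the worker to stop at the next point where that is safe.
    void provoke_failure() { failure_requested = true; }

    // Runs up to total_steps units of work and returns how many completed.
    int run(int total_steps)
    {
        int done = 0;

        for (; done < total_steps; ++done)
        {
            if (progress_callback)
                progress_callback(*this);
            if (failure_requested)
                break;  // stop here, before starting the next unit of work
        }
        return done;
    }
};
```

On the PostgreSQL side the callback would look at something like
InterruptPending and call provoke_failure() when it is set; the worker then
unwinds by its own means instead of being interrupted mid-allocation.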

> This facility allows us to get the bitcode for all operators
> (e.g. int8eq, float8pl etc), without maintaining two copies. The way
> I've currently set it up is that, if --with-llvm is passed to configure,
> all backend files are also compiled to bitcode files.  These bitcode
> files get installed into the server's
>   $pkglibdir/bitcode/postgres/
> under their original subfolder, eg.
>   ~/build/postgres/dev-assert/install/lib/bitcode/postgres/utils/adt/float.bc
> Using existing LLVM functionality (for parallel LTO compilation),
> additionally an index over these is stored to
>   $pkglibdir/bitcode/postgres.index.bc

That sounds pretty sweet.

> When deciding to JIT for the first time, $pkglibdir/bitcode/ is scanned
> for all .index.bc files and a *combined* index over all these files is
> built in memory.  The reason for doing so is that it allows "easy"
> access to inlining for extensions - they can install code into
>   $pkglibdir/bitcode/[extension]/
> accompanied by
>   $pkglibdir/bitcode/[extension].index.bc
> just alongside the actual library.

But that means that if an extension is installed after the initial
scan has been done, concurrent sessions won't notice the new files.
Maybe that's OK, but I wonder if we can do better.

> Do people feel these should be hidden behind #ifdefs, always present but
> prevented from being set to a meaningful value, or unrestricted?

We shouldn't allow non-superusers to set any GUC that dumps files to
the data directory or provides an easy way to crash the server, run
the machine out of memory, or similar.  GUCs that just print stuff, or
make queries faster/slower, can be set by anyone, I think.  I favor
having the debugging stuff available in the default build.  This
feature has a chance of containing bugs, and those bugs will be hard
to troubleshoot if the first step in getting information on what went
wrong is "recompile".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Jeff Davis
Hi,

On Mon, Jan 29, 2018 at 10:40 AM, Andres Freund  wrote:
> Hi,
>
> On 2018-01-29 10:28:18 -0800, Jeff Davis wrote:
>> OK. How about this: are you open to changes that move us in the
>> direction of extensibility later? (By this I do *not* mean imposing a
>> bunch of requirements on you... either small changes to your patches
>> or something part of another commit.)
>
> I'm good with that.
>
>
>> Or are you determined that this always should be a part of core?

> I'm strongly against there not being an in-core JIT. I'm not at all
> against adding APIs that allow to do different JIT implementations out
> of core.

I can live with that.

I recommend that you discuss with packagers and a few others, to
reduce the chance of disagreement later.

> Well, the source would require an actual compiler around. And the
> inlining *just* for the function code itself isn't actually that
> interesting, you e.g. want to also be able to

I think you hit enter too quickly... what's the rest of that sentence?

Regards,
  Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote:
> FWIW I've installed llvm 5.0.1 from distribution package, and now
> everything builds fine (I don't even need the configure tweak).
> 
> I think I had to build the other binaries because there was no 5.x llvm
> back then, but it's too far back so I don't remember.
> 
> Anyway, seems I'm fine for now.

Phew, I'm relieved.  I'd guess you built a 5.0 version while 5.0 was
still in development, so not all 5.0 functionality was available. Hence
the inconsistent looking result.  While I think we can support 4.0
without too much problem, there's obviously no point in trying to
support old between-releases versions...
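
As an aside, a version string such as the "5.0.0svn" reported earlier in the
thread can be checked mechanically. The sketch below is an assumption-laden
illustration, not something from the patch: sketch_parse_llvm_version is a
hypothetical helper, and treating any non-numeric suffix as a sign of a
non-release build is a heuristic.

```cpp
#include <cctype>
#include <string>

struct SketchLlvmVersion
{
    int major_version = -1;  // -1 means the string could not be parsed
    bool has_suffix = false; // true for strings such as "5.0.0svn"
};

// Parse the leading major number of `llvm-config --version` style output and
// note whether anything other than digits and dots follows it.
SketchLlvmVersion sketch_parse_llvm_version(const std::string &text)
{
    SketchLlvmVersion v;
    std::size_t i = 0;
    int value = 0;

    while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i])))
    {
        value = value * 10 + (text[i] - '0');
        ++i;
    }
    if (i == 0)
        return v;            // no leading digits at all
    v.major_version = value;

    for (; i < text.size(); ++i)
    {
        unsigned char c = static_cast<unsigned char>(text[i]);

        if (std::isspace(c))
            break;           // trailing newline from command output
        if (!std::isdigit(c) && c != '.')
        {
            v.has_suffix = true;
            break;
        }
    }
    return v;
}
```

A build script could use the has_suffix flag to warn that the installed LLVM
may not match any released API.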

> Sorry for the noise.

No worries.

- Andres



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra


On 01/29/2018 11:49 PM, Tomas Vondra wrote:
> 
> ...
>
> and that indeed changes the failure to this:
> 
> Writing postgres.bki
> Writing schemapg.h
> Writing postgres.description
> Writing postgres.shdescription
> llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
> llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
> a member of ‘llvm’
>llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
>^~~~
> llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
> llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>llvm::remove_bad_alloc_error_handler();
>^~~~
> llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
> llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>llvm::remove_bad_alloc_error_handler();
>^~~~
> make[3]: *** [: llvmjit_error.o] Error 1
> make[2]: *** [common.mk:45: lib-recursive] Error 2
> make[2]: *** Waiting for unfinished jobs
> make[1]: *** [Makefile:38: all-backend-recurse] Error 2
> make: *** [GNUmakefile:11: all-src-recurse] Error 2
> 
> 
> I'm not sure what that means, though ... maybe I really have system
> broken in some strange way.
> 

FWIW I've installed llvm 5.0.1 from distribution package, and now
everything builds fine (I don't even need the configure tweak).

I think I had to build the other binaries because there was no 5.x llvm
back then, but it's too far back so I don't remember.

Anyway, seems I'm fine for now. Sorry for the noise.


-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 23:49:14 +0100, Tomas Vondra wrote:
> On 01/29/2018 11:17 PM, Andres Freund wrote:
> > On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
> >> $ llvm-config --version
> >> 5.0.0svn
> >
> > Is that llvm-config the one in /usr/local/include/ referenced by the
> > error message above?
>
> I don't see it referenced anywhere, but it comes from here:
>
> $ which llvm-config
> /usr/local/bin/llvm-config
>
> > Or is it possible that llvm-config is from a different version than
> > the one the compiler picks the headers up from?
> >
>
> I don't think so. I don't have any other llvm versions installed, AFAICS.

Hm.


> > could you go to src/backend/lib, rm llvmjit.o, and show the full output
> > of make llvmjit.o?
> >
>
> Attached.
>
> > I wonder whether the issue is that my configure patch does
> > -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
> > rather than
> > -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
> > and that it thus picks up the wrong header first?
> >
>
> I've tried this configure tweak:
>
>    if test -n "$LLVM_CONFIG"; then
>      for pgac_option in `$LLVM_CONFIG --cflags`; do
>        case $pgac_option in
> -        -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
> +        -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
>        esac
>      done
>
> and that indeed changes the failure to this:

Err, huh?  I don't understand how that can change anything if you
actually have only one version of LLVM installed. Perhaps the
effect was just an ordering related artifact of [parallel] make?
I.e. just a question what failed first?
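
The reasoning can be shown with a toy model of include-path lookup: the first
directory in the list that contains the header wins, so the order of -I
options only matters when two copies of the header exist. The helper name and
the /opt/llvm4 and /opt/llvm5 paths below are hypothetical, and real compilers
apply extra rules (system directories, for instance) that this sketch ignores.

```cpp
#include <set>
#include <string>
#include <vector>

// Return the full path of the first directory in include_dirs that holds
// `header`, or an empty string if none does. existing_files stands in for
// the file system.
std::string sketch_resolve_header(const std::vector<std::string> &include_dirs,
                                  const std::set<std::string> &existing_files,
                                  const std::string &header)
{
    for (const std::string &dir : include_dirs)
    {
        const std::string candidate = dir + "/" + header;

        if (existing_files.count(candidate) > 0)
            return candidate;  // earlier -I entries shadow later ones
    }
    return "";
}
```

With a single installed copy both orderings resolve to the same file, which is
why the configure tweak should not have changed which header was picked up.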


> Writing postgres.bki
> Writing schemapg.h
> Writing postgres.description
> Writing postgres.shdescription
> llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
> llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
> a member of ‘llvm’
>llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
>^~~~
> llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
> llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>llvm::remove_bad_alloc_error_handler();
>^~~~
> llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
> llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>llvm::remove_bad_alloc_error_handler();
>^~~~

It's a bit hard to interpret this without the actual compiler
invocation. But I've just checked, both manually by inspecting the 5.0 source
and by compiling against 5.0, that the function definition definitely
exists:

andres@alap4:~/src/llvm-5$ git branch
  master
* release_50
andres@alap4:~/src/llvm-5$ ack remove_bad_alloc_error_handler
lib/Support/ErrorHandling.cpp
139:void llvm::remove_bad_alloc_error_handler() {

include/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();

So does my system llvm 5:
$ ack remove_bad_alloc_error_handler /usr/include/llvm-5.0/
/usr/include/llvm-5.0/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();

But not in 4.0:
$ ack remove_bad_alloc_error_handler /usr/include/llvm-4.0/


> gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
> -Wendif-labels -Wmissing-format-attribute -Wformat-security 
> -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
> -fno-omit-frame-pointer -O2 -I../../../src/include  -D_GNU_SOURCE 
> -I/usr/local/include -DNDEBUG -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE 
> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS   -c -o 
> llvmjit.o llvmjit.c
> llvmjit.c: In function ‘llvm_get_function’:
> llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ 
> from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
>  ^
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
> but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>   ^~~

To me this looks like those headers are from llvm 4, rather than 5:
$ grep -A2 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-4/include/llvm-c/OrcBindings.h
/**
 * Get symbol address from JIT instance.
 */
LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
 const char *SymbolName);

$ grep -A3 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-5/include/llvm-c/OrcBindings.h
/**
 * Get symbol address from JIT instance.
 */
LLVMOrcErrorCode LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
 LLVMOrcTargetAddress *RetAddr,
 const char *SymbolName);

So it does appear that your llvm-config and the actually installed llvm
don't quite agree. How did you install llvm?


Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra


On 01/29/2018 11:17 PM, Andres Freund wrote:
> On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
>> On 01/29/2018 10:57 PM, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
>>>> Hi, I wanted to look at this, but my attempts to build the jit branch
>>>> fail with some compile-time warnings (uninitialized variables) and
>>>> errors (unknown types, incorrect number of arguments). See the file
>>>> attached.
>>>
>>> Which git hash are you building?  What llvm version is this building
>>> against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
>>> --version return?
>>>
>>
>> I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
>> HEAD in the jit branch, AFAICS).
> 
> The warnings come from an incomplete patch I probably shouldn't have
> pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant
> (although will cause a handful of "ERROR: hm" regression failures),
> but I'll definitely pop that commit on the next rebase.  If you want you
> can just reset --hard to its parent.
> 

OK

> 
> Those errors are weird however:
> 
>> ...  ^
> 
>> I'm building like this:
>>
>> $ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
>>   --with-llvm --prefix=/home/postgres/pg-llvm
>>
>> $ make -s -j4 install
>>
>> and llvm-config --version says this:
>>
>> $ llvm-config --version
>> 5.0.0svn
> 
> Is that llvm-config the one in /usr/local/include/ referenced by the
> error message above?

I don't see it referenced anywhere, but it comes from here:

$ which llvm-config
/usr/local/bin/llvm-config

> Or is it possible that llvm-config is from a different version than
> the one the compiler picks the headers up from?
> 

I don't think so. I don't have any other llvm versions installed, AFAICS.

> could you go to src/backend/lib, rm llvmjit.o, and show the full output
> of make llvmjit.o?
> 

Attached.

> I wonder whether the issue is that my configure patch does
> -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
> rather than
> -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
> and that it thus picks up the wrong header first?
> 

I've tried this configure tweak:

   if test -n "$LLVM_CONFIG"; then
     for pgac_option in `$LLVM_CONFIG --cflags`; do
       case $pgac_option in
-        -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
+        -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
       esac
     done

and that indeed changes the failure to this:

Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
a member of ‘llvm’
   llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
   ^~~~
llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
   llvm::remove_bad_alloc_error_handler();
   ^~~~
llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
   llvm::remove_bad_alloc_error_handler();
   ^~~~
make[3]: *** [: llvmjit_error.o] Error 1
make[2]: *** [common.mk:45: lib-recursive] Error 2
make[2]: *** Waiting for unfinished jobs
make[1]: *** [Makefile:38: all-backend-recurse] Error 2
make: *** [GNUmakefile:11: all-src-recurse] Error 2


I'm not sure what that means, though ... maybe I really have system
broken in some strange way.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Wendif-labels -Wmissing-format-attribute -Wformat-security 
-fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fno-omit-frame-pointer -O2 -I../../../src/include  -D_GNU_SOURCE 
-I/usr/local/include -DNDEBUG -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE 
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS   -c -o 
llvmjit.o llvmjit.c
llvmjit.c: In function ‘llvm_get_function’:
llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from 
incompatible pointer type [-Wincompatible-pointer-types]
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
 ^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:239:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
  ^~~
In file included from llvmjit.c:45:0:

Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
> On 01/29/2018 10:57 PM, Andres Freund wrote:
> > Hi,
> > 
> > On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
> >> Hi, I wanted to look at this, but my attempts to build the jit branch
> >> fail with some compile-time warnings (uninitialized variables) and
> >> errors (unknown types, incorrect number of arguments). See the file
> >> attached.
> > 
> > Which git hash are you building?  What llvm version is this building
> > against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
> > --version return?
> > 
> 
> I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
> HEAD in the jit branch, AFAICS).

The warnings come from an incomplete patch I probably shouldn't have
pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant
(although will cause a handful of "ERROR: hm" regression failures),
but I'll definitely pop that commit on the next rebase.  If you want you
can just reset --hard to its parent.


Those errors are weird however:

> llvmjit.c: In function ‘llvm_get_function’:
> llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ 
> from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
>  ^
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
> but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>   ^~~
> llvmjit.c:239:6: error: too many arguments to function 
> ‘LLVMOrcGetSymbolAddress’
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
>   ^~~
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>   ^~~
> llvmjit.c:243:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ 
> from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, , mangled))
>  ^

> I'm building like this:
> 
> $ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
>   --with-llvm --prefix=/home/postgres/pg-llvm
> 
> $ make -s -j4 install
> 
> and llvm-config --version says this:
> 
> $ llvm-config --version
> 5.0.0svn

Is that llvm-config the one in /usr/local/include/ referenced by the
error message above?  Or is it possible that llvm-config is from a
different version than the one the compiler picks the headers up from?

could you go to src/backend/lib, rm llvmjit.o, and show the full output
of make llvmjit.o?

I wonder whether the issue is that my configure patch does
-I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
rather than
-I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
and that it thus picks up the wrong header first?

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra
On 01/29/2018 10:57 PM, Andres Freund wrote:
> Hi,
> 
> On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
>> Hi, I wanted to look at this, but my attempts to build the jit branch
>> fail with some compile-time warnings (uninitialized variables) and
>> errors (unknown types, incorrect number of arguments). See the file
>> attached.
> 
> Which git hash are you building?  What llvm version is this building
> against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
> --version return?
> 

I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
HEAD in the jit branch, AFAICS).

I'm building like this:

$ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
  --with-llvm --prefix=/home/postgres/pg-llvm

$ make -s -j4 install

and llvm-config --version says this:

$ llvm-config --version
5.0.0svn


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
> Hi, I wanted to look at this, but my attempts to build the jit branch
> fail with some compile-time warnings (uninitialized variables) and
> errors (unknown types, incorrect number of arguments). See the file
> attached.

Which git hash are you building?  What llvm version is this building
against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
--version return?

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra


On 01/24/2018 08:20 AM, Andres Freund wrote:
> Hi,
> 
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.
> 
> There's too many small changes, so I'm only going to list the major
> things. A good bit of that is new. The actual LLVM IR emissions itself
> hasn't changed that drastically.  Since I've not described them in
> detail before I'll describe from scratch in a few cases, even if things
> haven't fully changed.
> 

Hi, I wanted to look at this, but my attempts to build the jit branch
fail with some compile-time warnings (uninitialized variables) and
errors (unknown types, incorrect number of arguments). See the file
attached.

I wonder if I'm doing something wrong, or if there's something wrong
with my environment. I do have this:

$ clang -v
clang version 5.0.0 (trunk 299717)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Writing fmgroids.h
Writing fmgrprotos.h
Writing fmgrtab.c
Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
../../../src/include/lib/simplehash.h: In function ‘tuplehash_insert’:
execGrouping.c:428:28: warning: ‘slot’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
  econtext->ecxt_innertuple = slot;
  ~~^~
execGrouping.c:402:18: note: ‘slot’ was declared here
  TupleTableSlot *slot;
  ^~~~
../../../src/include/lib/simplehash.h: In function ‘tuplehash_lookup’:
execGrouping.c:428:28: warning: ‘slot’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
  econtext->ecxt_innertuple = slot;
  ~~^~
execGrouping.c:402:18: note: ‘slot’ was declared here
  TupleTableSlot *slot;
  ^~~~
../../../src/include/lib/simplehash.h: In function ‘tuplehash_delete’:
execGrouping.c:428:28: warning: ‘slot’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
  econtext->ecxt_innertuple = slot;
  ~~^~
execGrouping.c:402:18: note: ‘slot’ was declared here
  TupleTableSlot *slot;
  ^~~~
llvmjit.c: In function ‘llvm_get_function’:
llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from 
incompatible pointer type [-Wincompatible-pointer-types]
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
 ^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:239:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
  ^~~
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:243:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from 
incompatible pointer type [-Wincompatible-pointer-types]
  if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, , mangled))
 ^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:243:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
  if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, , mangled))
  ^~~
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c: In function ‘llvm_compile_module’:
llvmjit.c:383:3: error: unknown type name ‘LLVMSharedModuleRef’
   LLVMSharedModuleRef smod;
   ^~~
llvmjit.c:388:10: warning: implicit declaration of function 
‘LLVMOrcMakeSharedModule’ [-Wimplicit-function-declaration]
   smod = LLVMOrcMakeSharedModule(context->module);
  ^~~
llvmjit.c:389:48: warning: passing argument 2 of ‘LLVMOrcAddEagerlyCompiledIR’ 
from incompatible pointer type [-Wincompatible-pointer-types]
   if 

Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 10:28:18 -0800, Jeff Davis wrote:
> OK. How about this: are you open to changes that move us in the
> direction of extensibility later? (By this I do *not* mean imposing a
> bunch of requirements on you... either small changes to your patches
> or something part of another commit.)

I'm good with that.


> Or are you determined that this always should be a part of core?

I do think JIT compilation should be in core, yes. And after quite some
looking around that currently means either using LLVM or building our
own from scratch, and the latter doesn't seem attractive. But that
doesn't mean there can't *also* be extensibility. If somebody wants to
experiment with a more advanced version of JIT compilation, develop a
gcc-backed version (which can't be in core due to licensing), ... - I'm
happy to provide hooks that only require a reasonable effort and don't
affect the overall stability of the system (i.e. no callback from
PostgresMain()'s sigsetjmp() block).


> I don't want to stand in your way, but I am also hesitant to dive head
> first into LLVM and not look back. Postgres has always been lean, fast
> building, and with few dependencies.

It's an optional dependency, and it doesn't increase build time that
much... If we were to move the llvm interfacing code to a .so, there'd
not even be a packaging issue, you can just package that .so separately
and get errors if somebody tries to enable LLVM without that .so being
installed.


> In other words, are you "strongly against [extensibility being a
> requirement for the first commit]" or "strongly against [extensible
> JIT]"?

I'm strongly against there not being an in-core JIT. I'm not at all
against adding APIs that allow to do different JIT implementations out
of core.


> If the source for functions is in the catalog, we could build the
> bitcode at runtime and still do the inlining. We wouldn't need to do
> anything at build time. (Again, this would be "cool stuff for the
> future", I am not asking you for it now.)

Well, the source would require an actual compiler around. And the
inlining *just* for the function code itself isn't actually that
interesting, you e.g. want to also be able to

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Jeff Davis
On Mon, Jan 29, 2018 at 1:36 AM, Andres Freund  wrote:
> There's already a *lot* of integration points in the patchseries. Error
> handling needs to happen in parts of code we do not want to make
> extensible, the definition of expression steps has to exactly match, the
> core code needs to emit the right types for syncing, the core code needs
> to define the right FIELDNO accessors, there needs to be planner
> integrations.  Many of those aren't doable with even remotely the same
> effort, both initial and continual, from non-core code

OK. How about this: are you open to changes that move us in the
direction of extensibility later? (By this I do *not* mean imposing a
bunch of requirements on you... either small changes to your patches
or something part of another commit.) Or are you determined that this
always should be a part of core?

I don't want to stand in your way, but I am also hesitant to dive head
first into LLVM and not look back. Postgres has always been lean, fast
building, and with few dependencies. Who knows what LLVM will do in
the future and how that will affect postgres? Especially when, on day
one, we already know that it causes a few annoyances?

In other words, are you "strongly against [extensibility being a
requirement for the first commit]" or "strongly against [extensible
JIT]"?

>> > Well, but doing this outside of core would pretty much prohibit doing so
>> > forever, no?
>>
>> First of all, building .bc files at build time is much less invasive
>> than linking to the LLVM library.
>
> Could you expand on that, I don't understand why that'd be the case?

Building the .bc files at build time depends on LLVM, but is not very
version-dependent and has no impact on the resulting binary. That's
less invasive than a dependency on a library with an unstable API that
doesn't entirely work with our error reporting facility.

>> Third, there's lots of cool stuff we can do here:
>>   * put the source in the catalog
>>   * an extension could have its own catalog and build the source into
>> bitcode and cache it there
>>   * the source for functions would flow to replicas, etc.
>>   * security-conscious environments might even choose to run some of
>> the C code in a safe C interpreter rather than machine code
>
> I agree, but what does that have to do with the llvmjit stuff being an
> extension or not?

If the source for functions is in the catalog, we could build the
bitcode at runtime and still do the inlining. We wouldn't need to do
anything at build time. (Again, this would be "cool stuff for the
future", I am not asking you for it now.)

Regards,
 Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 15:45:56 +0300, Konstantin Knizhnik wrote:
> On 26.01.2018 22:38, Andres Freund wrote:
> > > And without it perf is not able to unwind stack trace for generated
> > > code.
> > You can work around that by using --call-graph lbr with a sufficiently
> > new perf. That'll not know function names et al, but at least the parent
> > will be associated correctly.
> 
> With --call-graph lbr result is ... slightly different (see attached
> profile) but still there is "unknown" bar.

Right. All that allows is to attribute the cost below the parent in the
perf report --children case. For it to be attributed to proper symbols
you need my llvm patch to support perf.



> Actually I am trying to find an answer to the question of why your version of JIT
> provides ~2 times speedup on Q1, while the ISPRAS version
> (https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf)
> speedup on Q1 is 5.5x.
> Maybe it is because they are using the double type to calculate aggregates
> while, as far as I understand, you are using standard Postgres aggregate
> functions?
> Or maybe because the ISPRAS version is not checking for NULL values...

All of those together, yes. On top of that, I'm aiming to work
incrementally towards core inclusion, rather than getting the best
results.  There's a *lot* that can be done to improve the generated code
- after e.g. hacking together an improvement to the argument passing (by
allocating isnull / nargs / arg[], argnull[] as a separate on-stack
allocation from FunctionCallInfoData), I get another 1.8x.  Eliminating redundant float
overflow checks gives another 1.2x. And so on.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Konstantin Knizhnik



On 26.01.2018 22:38, Andres Freund wrote:
> > And without it perf is not able to unwind stack trace for generated
> > code.
>
> You can work around that by using --call-graph lbr with a sufficiently
> new perf. That'll not know function names et al, but at least the parent
> will be associated correctly.

With --call-graph lbr result is ... slightly different (see attached
profile) but still there is "unknown" bar.

> > But you are compiling code using LLVMOrcAddEagerlyCompiledIR
> > and I find no way to pass no-omit-frame pointer option here.
>
> It shouldn't be too hard to open code support for it, encapsulated in a
> function:
>  // Set function attribute "no-frame-pointer-elim" based on
>  // NoFramePointerElim.
>  for (auto &F : *Mod) {
>    auto Attrs = F.getAttributes();
>    StringRef Value(options.NoFramePointerElim ? "true" : "false");
>    Attrs = Attrs.addAttribute(F.getContext(), AttributeList::FunctionIndex,
>   "no-frame-pointer-elim", Value);
>    F.setAttributes(Attrs);
>  }
> that's all that option did for mcjit.


I have implemented the following function:

void
llvm_no_frame_pointer_elimination(LLVMModuleRef mod)
{
    llvm::Module *module = llvm::unwrap(mod);
    for (auto &F : *module) {
        auto Attrs = F.getAttributes();
        Attrs = Attrs.addAttribute(F.getContext(),
                                   llvm::AttributeList::FunctionIndex,
                                   "no-frame-pointer-elim", "true");
        F.setAttributes(Attrs);
    }
}

and call it before LLVMOrcAddEagerlyCompiledIR in llvm_compile_module:

        llvm_no_frame_pointer_elimination(context->module);
        smod = LLVMOrcMakeSharedModule(context->module);

        if (LLVMOrcAddEagerlyCompiledIR(compile_orc, _handle, smod,
                                        llvm_resolve_symbol, NULL))
        {
            elog(ERROR, "failed to jit module");
        }


... but it has no effect: the produced profile is the same (with
--call-graph dwarf).

Maybe you can point out my mistake...


Actually I am trying to find an answer to the question of why your version of
JIT provides ~2 times speedup on Q1, while the ISPRAS version
(https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf)
speedup on Q1 is 5.5x.
Maybe it is because they are using the double type to calculate aggregates
while, as far as I understand, you are using standard Postgres aggregate
functions?

Or maybe because the ISPRAS version is not checking for NULL values...

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Pierre Ducroquet
On Monday, January 29, 2018 10:46:13 AM CET Andres Freund wrote:
> Hi,
> 
> On 2018-01-28 23:02:56 +0100, Pierre Ducroquet wrote:
> > I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM
> > documentation is really lacking when it comes to porting from version x
> > to x+1.
> > The only really missing part I found is that in 3.9, GlobalValueSummary
> > has no flag showing if it's not EligibleToImport. I am not sure about the
> > consequences.
> 
> I think that'd not be too bad, it'd just lead to some small increase in
> overhead as more modules would be loaded.
> 
> > BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc
> > file when cleaning, and doesn't seem to install in the right folder.
> 
> Hm, both seem to be right here? Note that the llvmjit_types.bc file
> should *not* go into the bitcode/ directory, as it's about syncing types
> not inlining. I've added a comment to that effect.

The file was installed in lib/ while the code expected it in lib/postgresql. 
So there was something wrong here.
And deleting the file when cleaning is needed if a different llvm version is
selected at configure time. The file must be generated with a clang release that is not
more recent than the llvm version linked to postgresql. Otherwise, the bitcode 
generated is not accepted by llvm.

Regards

 Pierre



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-28 23:02:56 +0100, Pierre Ducroquet wrote:
> I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM documentation is
> really lacking when it comes to porting from version x to x+1.
> The only really missing part I found is that in 3.9, GlobalValueSummary has no
> flag showing if it's not EligibleToImport. I am not sure about the
> consequences.

I think that'd not be too bad, it'd just lead to some small increase in
overhead as more modules would be loaded.


> BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc 
> file when cleaning, and doesn't seem to install in the right folder.

Hm, both seem to be right here? Note that the llvmjit_types.bc file
should *not* go into the bitcode/ directory, as it's about syncing types
not inlining. I've added a comment to that effect.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-28 Thread Pierre Ducroquet
On Thursday, January 25, 2018 8:02:54 AM CET Andres Freund wrote:
> Hi!
> 
> On 2018-01-24 22:51:36 -0800, Jeff Davis wrote:
> > Can we store the bitcode in pg_proc, simplifying deployment and
> > allowing extensions to travel over replication?
> 
> Yes, we could. You'd need to be a bit careful that all the machines have
> similar-ish cpu generations or compile with defensive settings, but that
> seems okay.

Hi

Doing this would 'bind' the database to the LLVM release used. LLVM can, as 
far as I know, generate bitcode only for the current version, and will only be 
able to read bitcode from previous versions. So you can't have, for instance, a
master server with LLVM 5 and a standby server with LLVM 4.
So maybe PostgreSQL would have to expose what LLVM version is currently used?
Or a major PostgreSQL release could accept only one major LLVM release, as was
suggested in another thread?
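
To make the constraint concrete, here is a minimal sketch in C of the compatibility rule as I stated it above (a reader accepts bitcode from its own or an earlier major release). The function name is hypothetical; it is not part of PostgreSQL or LLVM, and the rule itself is only "as far as I know".

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if bitcode produced by LLVM major
 * version `producer_major` is expected to be readable by an LLVM library
 * of major version `reader_major`, under the rule described above.
 */
static bool
bitcode_readable(int producer_major, int reader_major)
{
    return producer_major <= reader_major;
}
```

With this rule, bitcode stored by a master on LLVM 5 would be rejected by a standby on LLVM 4, while the reverse direction would work.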


 Pierre



Re: JIT compiling with LLVM v9.0

2018-01-28 Thread Pierre Ducroquet
On Thursday, January 25, 2018 8:12:42 PM CET Andres Freund wrote:
> Hi,
> 
> On 2018-01-25 10:00:14 +0100, Pierre Ducroquet wrote:
> > I don't know when this would be released,
> 
> August-October range.
> 
> > but the minimal supported LLVM
> > version will have a strong influence on the availability of that feature.
> > If today this JIT compiling was released with only LLVM 5/6 support, it
> > would be unusable for most Debian users (llvm-5 is only available in
> > sid). Even llvm 4 is not available in latest stable.
> > I'm already trying to build with llvm-4 and I'm going to try further with
> > llvm 3.9 (Debian Stretch doesn't have a more recent than this one, and I
> > won't have something better to play with my data), I'll keep you
> > informed. For sport, I may also try llvm 3.5 (for Debian Jessie).
> 
> I don't think it's unreasonable to not support super old llvm
> versions. This is a complex feature, and will take some time to
> mature. Supporting too many LLVM versions at the outset will have some
> cost.  Versions before 3.8 would require supporting mcjit rather than
> orc, and I don't think that'd be worth doing.  I think 3.9 might be a
> reasonable baseline...
> 
> Greetings,
> 
> Andres Freund

Hi

I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM documentation is 
really lacking when it comes to porting from version x to x+1.
The only really missing part I found is that in 3.9, GlobalValueSummary has no 
flag showing if it's not EligibleToImport. I am not sure about the 
consequences.
I'm still fixing some runtime issues so I will not bother you with the patch 
right now.
BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc 
file when cleaning, and doesn't seem to install in the right folder.

Regards

Pierre



Re: JIT compiling with LLVM v9.0

2018-01-27 Thread Jeff Davis
On Sat, Jan 27, 2018 at 5:15 PM, Andres Freund  wrote:
>> Also, I'm sure you considered this, but I'd like to ask if we can try
>> harder to make the JIT itself happen in an extension. It has some pretty
>> huge benefits:
>
> I'm very strongly against this. To the point that I'll not pursue JITing
> further if that becomes a requirement.

I would like to see this feature succeed and I'm not making any
specific demands.

> infeasible because quite frequently both non-JITed code and JITed code
> need adjustments. That'd solve your concern about

Can you explain further?

> I think it's a fool's errand to try to keep in sync with core changes on
> the expression evaluation and struct definition side of things. There's
> planner integration, error handling integration and similar related
> things too, all of which require core changes. Therefore I don't think
> there's a reasonable chance of success of doing this outside of core
> postgres.

I wasn't suggesting the entire patch be done outside of core. Core
will certainly need to know about JIT compilation, but I am not
convinced that it needs to know about the details of LLVM. All the
references to the LLVM library itself are contained in a few files, so
you've already got it well organized. What's stopping us from putting
that code into a "jit provider" extension that implements the proper
interfaces?
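
To illustrate what I mean by a "jit provider", here is a rough sketch in C. Everything in it (the struct, the function names, the callback set) is hypothetical and only meant to show the shape of the idea: core knows about a small callback table, and the LLVM-specific code lives behind it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback table a "jit provider" extension would fill in. */
typedef struct JitProviderSketch
{
    const char *name;
    /* Compile one expression; returns false if it cannot be JITed. */
    bool        (*compile_expr) (void *expr_state);
    /* Release all resources owned by the provider (may be NULL). */
    void        (*release) (void);
} JitProviderSketch;

static const JitProviderSketch *active_provider = NULL;

/* Would be called by the extension when it is loaded. */
static void
register_jit_provider(const JitProviderSketch *provider)
{
    active_provider = provider;
}

/* Core-side entry point: without a provider, the caller interprets. */
static bool
jit_compile_expr(void *expr_state)
{
    if (active_provider == NULL || active_provider->compile_expr == NULL)
        return false;
    return active_provider->compile_expr(expr_state);
}

/* Trivial provider, used only to exercise the interface. */
static bool
dummy_compile(void *expr_state)
{
    (void) expr_state;
    return true;
}

static const JitProviderSketch dummy_provider = {"dummy", dummy_compile, NULL};
```

Core would only ever call jit_compile_expr(); whether the provider behind it uses LLVM or something else would be invisible at that level.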

> Well, but doing this outside of core would pretty much prohibit doing so
> forever, no?

First of all, building .bc files at build time is much less invasive
than linking to the LLVM library. Any version of clang will produce
bitcode that can be read by any LLVM library or tool later (more or
less).

Second, we could change our minds later. Mark any extension APIs as
experimental, and decide we want to move LLVM into postgres whenever
it is needed.

Third, there's lots of cool stuff we can do here:
  * put the source in the catalog
  * an extension could have its own catalog and build the source into
bitcode and cache it there
  * the source for functions would flow to replicas, etc.
  * security-conscious environments might even choose to run some of
the C code in a safe C interpreter rather than machine code

So I really don't see this as permanently closing off our options.

Regards,
 Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-27 Thread Jeff Davis
On Sat, Jan 27, 2018 at 1:20 PM, Andres Freund  wrote:
> b) The optimizations to take advantage of the constants and make the
>    code faster with the constant tupledesc is fairly slow (you pretty
>    much need at least an -O2 equivalent), whereas the handrolled tuple
>    deforming is faster than the slot_getsomeattrs with just a single,
>    pretty cheap, mem2reg pass.  We're talking about ~1ms vs 70-100ms in
>    a lot of cases.  The optimizer often will not actually unroll the
>    loop with many attributes despite that being beneficial.

This seems like the major point. We would have to customize the
optimization passes a lot and/or choose carefully which ones we apply.

> I think in most cases using the approach you advocate makes sense, to
> avoid duplication, but tuple deforming is such a major bottleneck that I
> think it's clearly worth doing it manually. Being able to use llvm with
> just an always-inline and a mem2reg pass makes it so much more widely
> applicable than doing the full inlining and optimization work.

OK.

On another topic, I'm trying to find a way we could break this patch
into smaller pieces. For instance, if we concentrate on tuple
deforming, maybe it would be committable in time for v11?

I see that you added some optimizations to the existing generic code.
Do those offer a measurable improvement, and if so, can you commit
those first to make the JIT stuff more readable?

Also, I'm sure you considered this, but I'd like to ask if we can try
harder to make the JIT itself happen in an extension. It has some pretty
huge benefits:
  * The JIT code is likely to go through a lot of changes, and it
would be nice if it wasn't tied to a yearly release cycle.
  * Would mean postgres itself isn't dependent on a huge library like
llvm, which just seems like a good idea from a packaging standpoint.
  * May give GCC or something else a chance to compete with its own JIT.
  * It may make it easier to get something in v11.

It appears reasonable to make the slot deforming and expression
evaluator parts an extension. execExpr.h only exports a couple new
functions; heaptuple.c has a lot of changes but they seem like they
could be separated (unless I'm missing something).

The biggest problem is that the inlining would be much harder to
separate out, because you are building the .bc files at build time. I
really like the idea of inlining, but it doesn't necessarily need to
be in the first commit.

Regards,
 Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-27 Thread Andres Freund
Hi,

On 2018-01-26 22:52:35 -0800, Jeff Davis wrote:
> The version of LLVM that I tried this against had a linker option
> called "InternalizeLinkedSymbols" that would prevent the visibility
> problem you mention (assuming I understand you correctly).

I don't think they're fully solvable - you can't really internalize a
reference to a mutable static variable in another translation
unit. Unless you modify that translation unit, which doesn't work when
postgres is running.


> That option is no longer there so I will have to figure out how to do
> it with the current LLVM API.

Look at the llvmjit_wrap.c code invoking FunctionImporter - that pretty
much does that.  I'll push a cleaned up version of that code sometime
this weekend (it'll then live in llvmjit_inline.cpp).


> > Afaict that's effectively what I've already implemented. We could export
> > more input as constants to the generated program, but other than that...
>
> I brought this up in the context of slot_compile_deform(). In your
> patch, you have code like:
>
> +   if (!att->attnotnull)
> +   {
> ...
> +       v_nullbyte = LLVMBuildLoad(
> +           builder,
> +           LLVMBuildGEP(builder, v_bits,
> +                        &v_nullbyteno, 1, ""),
> +           "attnullbyte");
> +
> +       v_nullbit = LLVMBuildICmp(
> +           builder,
> +           LLVMIntEQ,
> +           LLVMBuildAnd(builder, v_nullbyte, v_nullbytemask, ""),
> +           LLVMConstInt(LLVMInt8Type(), 0, false),
> +           "attisnull");
> ...
>
> So it looks like you are reimplementing the generic code, but with
> conditional code gen. If the generic code changes, someone will need
> to read, understand, and change this code, too, right?

Right. Not that that's code that has changed that much...


> With my approach, then it would initially do *un*conditional code gen,
> and be less efficient and less specialized than the code generated by
> your current patch. But then it would link in the constant tupledesc,
> and optimize, and the optimizer will realize that they are constants
> (hopefully) and then cut out a lot of the dead code and specialize it
> to the given tupledesc.

Right.


> This places a lot of faith in the optimizer and I realize it may not
> happen as nicely with real code as it did with my earlier experiments.
> Maybe you already tried and you are saying that's a dead end? I'll
> give it a shot, though.

I did that, yes. There's two major downsides:

a) The code isn't as efficient as the handrolled code. The handrolled
   code e.g. can take into account that it doesn't need to access the
   NULL bitmap for a NOT NULL column and we don't need to check the
   tuple's number of attributes if there's a following NOT NULL
   attribute. Those save a good number of cycles.

b) The optimizations to take advantage of the constants and make the
   code faster with the constant tupledesc is fairly slow (you pretty
   much need at least an -O2 equivalent), whereas the handrolled tuple
   deforming is faster than the slot_getsomeattrs with just a single,
   pretty cheap, mem2reg pass.  We're talking about ~1ms vs 70-100ms in
   a lot of cases.  The optimizer often will not actually unroll the
   loop with many attributes despite that being beneficial.

I think in most cases using the approach you advocate makes sense, to
avoid duplication, but tuple deforming is such a major bottleneck that I
think it's clearly worth doing it manually. Being able to use llvm with
just an always-inline and a mem2reg pass makes it so much more widely
applicable than doing the full inlining and optimization work.
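
To illustrate point a), here is a simplified sketch in plain C. The ColumnSketch struct and the function names are made up for this mail; only the bit test mirrors how the NULL bitmap is laid out (a set bit means the value is present). In the JIT the attnotnull branch is decided while generating code, so the emitted code for a NOT NULL column contains no bitmap access at all; here it is an ordinary runtime branch purely for demonstration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit set in the bitmap means "value present"; cleared means NULL. */
static bool
att_is_null(int attnum, const uint8_t *bits)
{
    return !(bits[attnum >> 3] & (1 << (attnum & 0x07)));
}

/* Hypothetical, heavily simplified column description. */
typedef struct ColumnSketch
{
    bool        attnotnull;
} ColumnSketch;

/* Generic path: always consults the bitmap if the tuple has one. */
static bool
column_is_null_generic(const ColumnSketch *cols, int attnum,
                       const uint8_t *bits)
{
    (void) cols;
    return bits != NULL && att_is_null(attnum, bits);
}

/*
 * Specialized path: a NOT NULL column never needs the bitmap.  A code
 * generator would evaluate this condition once, at generation time.
 */
static bool
column_is_null_specialized(const ColumnSketch *cols, int attnum,
                           const uint8_t *bits)
{
    if (cols[attnum].attnotnull)
        return false;           /* no bitmap access needed */
    return bits != NULL && att_is_null(attnum, bits);
}
```

Both paths agree for valid tuples; the specialized one simply does less work per NOT NULL column.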


> >> I experimented a bit before and it works for basic cases, but I'm not
> >> sure if it's as good as your hand-generated LLVM.
> >
> > For deforming it doesn't even remotely get as good in my experiments.
>
> I'd like some more information here -- what didn't work? It didn't
> recognize constants? Or did recognize them, but didn't optimize as
> well as you did by hand?

It didn't optimize as well as I did by hand, without significantly
complicating (and slowing) the originating code. It sometimes
decided not to unroll the loop, and it takes a *lot* longer than the
direct emission of the code.

I'm hoping to work on making more of the executor JITed, and there I do
think it's largely going to be what you're proposing, due to the sheer
mass of code.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-26 Thread Jeff Davis
Hi,

On Fri, Jan 26, 2018 at 6:40 PM, Andres Freund  wrote:
>> I would like to see if we can get a combination of JIT and LTO to work
>> together to specialize generic code at runtime.
>
> Well, LTO can't quite work. It relies on being able to mark code in
> modules linked together as externally visible - and clearly we can't do
> that for a running postgres binary. At least in all incarnations I'm
> aware of.  But that's why the tree I posted supports inlining of code.

I meant a more narrow use of LTO: since we are doing linking in step
#4 and optimization in step #5, it's optimizing the code after
linking, which is a kind of LTO (though perhaps I'm misusing the
term?).

The version of LLVM that I tried this against had a linker option
called "InternalizeLinkedSymbols" that would prevent the visibility
problem you mention (assuming I understand you correctly). That option
is no longer there so I will have to figure out how to do it with the
current LLVM API.

> Afaict that's effectively what I've already implemented. We could export
> more input as constants to the generated program, but other than that...

I brought this up in the context of slot_compile_deform(). In your
patch, you have code like:

+   if (!att->attnotnull)
+   {
...
+       v_nullbyte = LLVMBuildLoad(
+           builder,
+           LLVMBuildGEP(builder, v_bits,
+                        &v_nullbyteno, 1, ""),
+           "attnullbyte");
+
+       v_nullbit = LLVMBuildICmp(
+           builder,
+           LLVMIntEQ,
+           LLVMBuildAnd(builder, v_nullbyte, v_nullbytemask, ""),
+           LLVMConstInt(LLVMInt8Type(), 0, false),
+           "attisnull");
...

So it looks like you are reimplementing the generic code, but with
conditional code gen. If the generic code changes, someone will need
to read, understand, and change this code, too, right?

With my approach, then it would initially do *un*conditional code gen,
and be less efficient and less specialized than the code generated by
your current patch. But then it would link in the constant tupledesc,
and optimize, and the optimizer will realize that they are constants
(hopefully) and then cut out a lot of the dead code and specialize it
to the given tupledesc.

This places a lot of faith in the optimizer and I realize it may not
happen as nicely with real code as it did with my earlier experiments.
Maybe you already tried and you are saying that's a dead end? I'll
give it a shot, though.

> Now the JITed expressions tree currently makes it hard for LLVM to
> recognize some constant input as constant, but what's largely needed for
> that to be better is some improvements in where temporary values are
> stored (should be in alloca's rather than local memory, so mem2reg can
> do its thing).  It's a TODO... Right now LLVM will figure out constant
> inputs to non-strict functions, but not strict ones, but after fixing
> some of what I've mentioned previously it works pretty universally.
>
>
> Have I misunderstood and there's some significant functional difference?

I'll try to explain with code, and then we can know for sure ;-)

Sorry for the ambiguity, I'm probably misusing a few terms.

>> I experimented a bit before and it works for basic cases, but I'm not
>> sure if it's as good as your hand-generated LLVM.
>
> For deforming it doesn't even remotely get as good in my experiments.

I'd like some more information here -- what didn't work? It didn't
recognize constants? Or did recognize them, but didn't optimize as
well as you did by hand?

Regards,
  Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-26 Thread Jeff Davis
On Wed, Jan 24, 2018 at 11:02 PM, Andres Freund  wrote:
> Not entirely sure what you mean. You mean why I don't inline
> slot_getsomeattrs() etc and instead generate code manually?  The reason
> is that the generated code is a *lot* smarter due to knowing the
> specific tupledesc.

I would like to see if we can get a combination of JIT and LTO to work
together to specialize generic code at runtime.

Let's say you have a function f(int x, int y, int z). You want to be
able to specialize it on y at runtime, so that a loop gets unrolled in
the common case where y is small.

1. At build time, create bitcode for the generic implementation of f().
2. At run time, load the generic bitcode into a module (let's call it
the "generic module")
3. At run time, create a new module (let's call it the "bind module")
that only does the following things:
   a. declares a global variable bind_y, and initializes it to the value 3
   b. declares a wrapper function f_wrapper(int x, int z), and all the
function does is call f(x, bind_y, z)
4. Link the generic module and the bind module together (let's call
the result the "linked module")
5. Optimize the linked module

After sorting out a few details about symbols and inlining, what will
happen is that the generic f() will be inlined into f_wrapper, and it
will see that bind_y is a constant, and then unroll a "for" loop over
y.
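
At the source level, the shape of the two modules would be roughly the following. This is a plain C sketch, not the bitcode-level machinery: the body of f() is made up for illustration, and whether a given optimizer actually inlines and unrolls here is up to the optimizer.

```c
/* "Generic module": the loop bound y is only known at run time. */
static int
f(int x, int y, int z)
{
    int         acc = z;

    for (int i = 0; i < y; i++)
        acc += x * (i + 1);
    return acc;
}

/* "Bind module": one constant and one thin wrapper, nothing else. */
static const int bind_y = 3;

static int
f_wrapper(int x, int z)
{
    /* After inlining, y is the constant 3 and the loop can be unrolled. */
    return f(x, bind_y, z);
}
```

Callers would only ever see f_wrapper(x, z); the generic f() stays untouched when the binding changes.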

I experimented a bit before and it works for basic cases, but I'm not
sure if it's as good as your hand-generated LLVM.

If we can make this work, it would be a big win for
readability/maintainability. The hand-generated LLVM is limited to the
bind module, which is very simple, and doesn't need to be changed when
the implementation of f() changes.

Regards,
 Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-26 Thread Andres Freund
Hi,

On 2018-01-26 13:06:27 +0300, Konstantin Knizhnik wrote:
> One more question: do you have any idea how to profile JITed code?

Yes ;). It depends a bit on what exactly you want to do. Is it
sufficient to get time associated with the parent caller, or do you need
instruction-level access?


> There is no LLVMOrcRegisterPerf in LLVM 5, so jit_profiling_support option
> does nothing.

Right, it's a patch I'm trying to get into the next version of
llvm. With that you get access to the shared object and everything.


> And without it perf is not able to unwind stack trace for generated
> code.

You can work around that by using --call-graph lbr with a sufficiently
new perf. That'll not know function names et al, but at least the parent
will be associated correctly.


> But you are compiling code using LLVMOrcAddEagerlyCompiledIR
> and I find no way to pass no-omit-frame pointer option here.

It shouldn't be too hard to open code support for it, encapsulated in a
function:
// Set function attribute "no-frame-pointer-elim" based on
// NoFramePointerElim.
for (auto &F : *Mod) {
  auto Attrs = F.getAttributes();
  StringRef Value(options.NoFramePointerElim ? "true" : "false");
  Attrs = Attrs.addAttribute(F.getContext(), AttributeList::FunctionIndex,
 "no-frame-pointer-elim", Value);
  F.setAttributes(Attrs);
}
that's all that option did for mcjit.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-26 Thread David Fetter
On Thu, Jan 25, 2018 at 11:20:28AM -0800, Andres Freund wrote:
> On 2018-01-25 18:40:53 +0300, Konstantin Knizhnik wrote:
> > Another question is whether it is sensible to redundantly do
> > expensive work (llvm compilation) in all backends.
> 
> Right now we kinda have to, but I really want to get rid of that.
> There's some pointers included as constants in the generated code. I
> plan to work on getting rid of that requirement, but after getting
> the basics in (i.e. realistically not this release).  Even after
> that I'm personally much more interested in caching the generated
> code inside a backend, rather than across backends.   Function
> addresses et al being different between backends would add some
> complications, can be overcome, but I'm doubtful it's immediately
> worth it.

If we go with threading for this part, sharing that state may be
simpler.  It seems a lot of work is going into things that threading
does at a much lower developer cost, but that's a different
conversation.

> > So before starting code generation, ExecReadyCompiledExpr can first
> > build signature and check if correspondent library is already present.
> > Also it will be easier to control space used by compiled libraries in
> > this
> 
> Right, I definitely think we want to do that at some point not too far
> away in the future. That makes the applicability of JITing much broader.
> 
> More advanced forms of this are that you JIT in the background for
> frequently executed code (so not to incur latency the first time
> somebody executes). Aand/or that you emit unoptimized code the first
> time through, which is quite quick, and run the optimizer after the
> query has been executed a number of times.

Both sound pretty neat.
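
For the second variant, the decision itself could be as small as the following sketch in C. The enum, the function, and the threshold of 10 are all hypothetical; nothing like this exists in the patch as far as I know.

```c
/* Hypothetical tiers: cheap code first, optimized code for hot queries. */
typedef enum JitTier
{
    JIT_UNOPTIMIZED,
    JIT_OPTIMIZED
} JitTier;

/* Assumed threshold; a real value would have to come from measurement. */
#define OPTIMIZE_AFTER_EXECUTIONS 10

/* Pick a tier from the number of times the query has already run. */
static JitTier
tier_for_execution(unsigned executions_so_far)
{
    if (executions_so_far >= OPTIMIZE_AFTER_EXECUTIONS)
        return JIT_OPTIMIZED;
    return JIT_UNOPTIMIZED;
}
```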

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: JIT compiling with LLVM v9.0

2018-01-26 Thread Konstantin Knizhnik



On 26.01.2018 11:23, Andres Freund wrote:
> Hi,
>
> Thanks for testing things out!

Thank you for this work.
One more question: do you have any idea how to profile JITed code?
There is no LLVMOrcRegisterPerf in LLVM 5, so jit_profiling_support
option does nothing.

And without it perf is not able to unwind stack trace for generated code.
I attached the produced profile, looks like "unknown" bar corresponds to
JIT code.


There is NoFramePointerElim option in LLVMMCJITCompilerOptions 
structure, but it requires use of ExecutionEngine.

Something like this:

    mod = llvm_mutable_module(context);
    {
        struct LLVMMCJITCompilerOptions options;
        LLVMExecutionEngineRef jit;
        char* error;
        LLVMCreateExecutionEngineForModule(&jit, mod, &error);
        LLVMInitializeMCJITCompilerOptions(&options, sizeof(options));
        options.NoFramePointerElim = 1;
        LLVMCreateMCJITCompilerForModule(&jit, mod, &options, sizeof(options),
                                         &error);
    }
    ...

But you are compiling code using LLVMOrcAddEagerlyCompiledIR
and I find no way to pass no-omit-frame pointer option here.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: JIT compiling with LLVM v9.0

2018-01-26 Thread Andres Freund
Hi,

Thanks for testing things out!


On 2018-01-26 10:44:24 +0300, Konstantin Knizhnik wrote:
> Also I noticed that parallel execution disables JIT.

Oh, oops, I broke that recently by moving where the decision about
whether to jit or not is. There actually is JITing, but only in the
leader.


> Are there any problems in principle with combining JIT and parallel execution?

No, there's not, I just need to send down the flag to JIT down to the
workers. Will look at it tomorrow.  If you want to measure / play around
till then you can manually hack the PGJIT_* checks in execExprCompile.c

With that done, on my laptop, tpch-Q01, scale 10:

SET max_parallel_workers_per_gather=0; SET jit_expressions = 1;
15145.508 ms
SET max_parallel_workers_per_gather=0; SET jit_expressions = 0;
23808.809 ms
SET max_parallel_workers_per_gather=4; SET jit_expressions = 1;
4775.170 ms
SET max_parallel_workers_per_gather=4; SET jit_expressions = 0;
7173.483 ms

(that's with inlining and deforming enabled too)
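
For reference, the speedups implied by these timings are simply baseline time divided by JIT time; a tiny C helper (the name is just for this mail) makes the arithmetic explicit. With the numbers above that comes to roughly 1.57x without parallel workers and roughly 1.50x with 4 workers.

```c
/* Speedup of JIT over non-JIT execution: baseline time / JIT time. */
static double
speedup(double baseline_ms, double jit_ms)
{
    return baseline_ms / jit_ms;
}
```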

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-25 Thread Konstantin Knizhnik

> Hi,
>
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.

Below are results on my system for Q1 TPC-H scale 10 (~13 GB database):

Options                   Time
Default                   20075
jit_expressions=on        16105
jit_tuple_deforming=on    14734
jit_perform_inlining=on   13441


Also I noticed that parallel execution disables JIT.
On my computer with 4 cores, the time of Q1 with parallel execution is 6549.
Are there any fundamental problems with combining JIT and parallel execution?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: JIT compiling with LLVM v9.0

2018-01-25 Thread Andres Freund
Hi,

On 2018-01-25 10:00:14 +0100, Pierre Ducroquet wrote:
> I don't know when this would be released,

August-October range.


> but the minimal supported LLVM
> version will have a strong influence on the availability of that feature. If
> today this JIT compiling was released with only LLVM 5/6 support, it would be
> unusable for most Debian users (llvm-5 is only available in sid). Even llvm 4
> is not available in latest stable.
> I'm already trying to build with llvm-4 and I'm going to try further with llvm
> 3.9 (Debian Stretch doesn't have a more recent than this one, and I won't have
> something better to play with my data), I'll keep you informed. For sport, I
> may also try llvm 3.5 (for Debian Jessie).

I don't think it's unreasonable to not support super old llvm
versions. This is a complex feature, and will take some time to
mature. Supporting too many LLVM versions at the outset will have some
cost.  Versions before 3.8 would require supporting mcjit rather than
orc, and I don't think that'd be worth doing.  I think 3.9 might be a
reasonable baseline...

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-25 Thread Konstantin Knizhnik



On 24.01.2018 10:20, Andres Freund wrote:

> Hi,
>
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.
>
> There's too many small changes, so I'm only going to list the major
> things. A good bit of that is new. The actual LLVM IR emission itself
> hasn't changed that drastically.  Since I've not described them in
> detail before I'll describe from scratch in a few cases, even if things
> haven't fully changed.
>
>
> == JIT Interface ==
>
> To avoid emitting code in very small increments (increases mmap/mremap
> rw vs exec remapping, compile/optimization time), code generation
> doesn't happen for every single expression individually, but in batches.
>
> The basic object to emit code via is a jit context created with:
>    extern LLVMJitContext *llvm_create_context(bool optimize);
> which in the case of expressions is stored on-demand in the EState. For
> other use cases that might not be the right location.
>
> To emit LLVM IR (i.e. the portable code that LLVM then optimizes and
> generates native code for), one gets a module from that with:
>    extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);
>
> to which "arbitrary" numbers of functions can be added. In case of
> expression evaluation, we get the module once for every expression, and
> emit one function for the expression itself, and one for every
> applicable/referenced deform function.
>
> As explained above, we do not want to emit code immediately from within
> ExecInitExpr()/ExecReadyExpr(). To facilitate that, readying a JITed
> expression sets the function to a callback, which gets the actual native
> function on the first actual call.  That allows batching together the
> generation of all native functions that are defined before the first
> expression is evaluated - in a lot of queries that'll be all.
>
> Said callback then calls
>    extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
> which'll emit code for the "in progress" mutable module if necessary,
> and then searches all generated functions for the name. The names are
> created via
>    extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
> currently "evalexpr" and "deform" with a generation and counter suffix.
>
> Currently expressions which do not have access to an EState, basically
> all "parent"-less expressions, aren't JIT compiled. That could be
> changed, but I so far do not see a huge need.


Hi,

As far as I understand, generation of native code is now always done for
all supported expressions, and individually by each backend.
I wonder whether it would be useful to put more effort into understanding
when compilation to native code should be done and when interpretation is
better.
For example, many JIT-able languages like Lua use traces, i.e. the
query is first interpreted and a trace is generated. If the same trace is
followed more than N times, then native code is generated for it.


In the context of a DBMS executor it is obvious that only frequently
executed or expensive queries have to be compiled.
So we can use the estimated plan cost and the number of query executions as
simple criteria for JIT-ing the query.
Maybe compilation of simple queries (with small cost) should be done
only for prepared statements...
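
A minimal sketch of such a criterion, in plain C. Everything here is
hypothetical: JitPolicy, its two thresholds and should_jit() are names made
up for illustration and do not exist in the patch set. The idea is only that
expensive plans are compiled on first use, while cheap ones are compiled
only if they are prepared and have already been executed often enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning knobs; neither exists in the patch set. */
typedef struct JitPolicy
{
    double      min_plan_cost;   /* compile at once at or above this cost */
    int         min_executions;  /* ...or after this many executions */
} JitPolicy;

/*
 * Decide whether to compile a query. Expensive plans qualify immediately;
 * cheap plans qualify only when prepared and executed often enough.
 */
bool
should_jit(const JitPolicy *policy, double plan_cost,
           bool is_prepared, int executions_so_far)
{
    assert(policy != NULL);

    if (plan_cost >= policy->min_plan_cost)
        return true;
    return is_prepared && executions_so_far >= policy->min_executions;
}
```

With thresholds of, say, cost 100000 and 5 executions, a one-off cheap query
is always interpreted, and a cheap prepared statement switches to native
code on its sixth run.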


Another question is whether it is sensible to redundantly do expensive
work (LLVM compilation) in all backends.
This question relates to a shared prepared statement cache. But even
without such a cache, it seems possible to use some signature of the
compiled expression as the library name, and so allow
these libraries to be shared between backends. So before starting code
generation, ExecReadyCompiledExpr can first build the signature and check
whether the corresponding library is already present.
It will also be easier to control the space used by compiled libraries in
this case.
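
To illustrate the signature idea, here is a rough sketch in plain C. It
assumes the expression has already been serialized into a pointer-free byte
string (how to do that is left open), hashes it with 64-bit FNV-1a, and
derives a file name from the hash. The function names and the
"jit_<hash>.so" naming scheme are invented for this example. A real
implementation would also have to deal with hash collisions, for instance
by storing the serialized expression next to the library and comparing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a hash over an arbitrary byte string. */
uint64_t
expr_signature(const unsigned char *data, size_t len)
{
    uint64_t    hash = UINT64_C(0xcbf29ce484222325);   /* offset basis */

    for (size_t i = 0; i < len; i++)
    {
        hash ^= data[i];
        hash *= UINT64_C(0x100000001b3);                /* FNV prime */
    }
    return hash;
}

/*
 * Format the (hypothetical) cache file name for a signature. Returns the
 * number of characters written, or -1 if the buffer is too small.
 */
int
expr_library_name(uint64_t signature, char *buf, size_t buflen)
{
    int         n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "jit_%016llx.so",
                 (unsigned long long) signature);
    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}
```

A backend would compute the name first and only start code generation if no
library with that name can be found.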


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: JIT compiling with LLVM v9.0

2018-01-25 Thread Pierre Ducroquet
On Thursday, January 25, 2018 7:38:16 AM CET Andres Freund wrote:
> Hi,
> 
> On 2018-01-24 22:33:30 -0800, Jeff Davis wrote:
> > On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet wrote:
> > > In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only
> > > as a C++ API in llvm/IR/DebugInfo.h.
> > 
> > The LLVM APIs don't seem to be very stable; won't there just be a
> > continuous stream of similar issues?
> 
> There'll be some of that yes. But the entire difference between 5 and
> what will be 6 was not including one header, and not calling one unneeded
> function. That doesn't seem like a crazy amount of adaptation that needs
> to be done.  From a quick look about porting to 4, it'll be a bit, but
> not much more effort.

I don't know when this would be released, but the minimal supported LLVM
version will have a strong influence on the availability of that feature. If
this JIT compiling were released today with only LLVM 5/6 support, it would be
unusable for most Debian users (llvm-5 is only available in sid). Even llvm 4
is not available in the latest stable.
I'm already trying to build with llvm-4 and I'm going to try further with llvm
3.9 (Debian Stretch doesn't have anything more recent than this, and I won't
have anything better to play with my data). I'll keep you informed. For sport,
I may also try llvm 3.5 (for Debian Jessie).

 Pierre





Re: JIT compiling with LLVM v9.0

2018-01-24 Thread Jeff Davis
On Tue, Jan 23, 2018 at 11:20 PM, Andres Freund  wrote:
> Hi,
>
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.

Great!

A couple high-level questions:

1. I notice a lot of use of the LLVM builder, for example, in
slot_compile_deform(). Why can't you do the same thing you did with
function code, where you create the ".bc" at build time from plain C
code, and then load it at runtime?
2. I'm glad you considered extensions. How far can we go with this in
the future? Can we have bitcode-only extensions that don't need a .so
file? Can we store the bitcode in pg_proc, simplifying deployment and
allowing extensions to travel over replication? I am not asking for
this now, of course, but I'd like to get the idea out there so we
leave room.

Regards,
 Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-24 Thread Jeff Davis
On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet  wrote:
> In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
> C++ API in llvm/IR/DebugInfo.h.

The LLVM APIs don't seem to be very stable; won't there just be a
continuous stream of similar issues?

Pinning major postgresql versions to specific LLVM versions doesn't
seem very appealing. Even if you aren't interested in the latest
changes in LLVM, trying to get the right version on your machine will
be annoying.

Regards,
Jeff Davis



Re: JIT compiling with LLVM v9.0

2018-01-24 Thread Andres Freund
Hi,

On 2018-01-24 14:06:30 -0800, Andres Freund wrote:
> > In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as
> > a C++ API in llvm/IR/DebugInfo.h.
> 
> Hm, I compiled against 5.0 quite recently, but added the stripping of
> debuginfo later on.  I'll add a fallback method, thanks for pointing that
> out!

Went more with your fix, there's not much point in using the C API
here. Should probably remove the use of it nearly entirely from the .cpp
file (save for wrap/unwrap() use). But man, the 'class Error' usage is
one major ugly pain.


> > But I still could not build because the LLVM API changed between 5.0 and
> > 6.0 regarding value info SummaryList.
> 
> Hm, thought these changes were from before my 5.0 test. But the code
> evolved heavily, so I might misremember. Let me see.

Ah, that one was actually easier to fix. There's no need to get the base
object at all, so it's just a one-line change.


> Thanks, I'll try to push fixes into the tree soon-ish..

Pushed.

Thanks again for looking!

- Andres



Re: JIT compiling with LLVM v9.0

2018-01-24 Thread Pierre Ducroquet
On Wednesday, January 24, 2018 8:20:38 AM CET Andres Freund wrote:
> As the patchset is large (500kb) and I'm still quickly evolving it, I do
> not yet want to attach it. The git tree is at
>   https://git.postgresql.org/git/users/andresfreund/postgres.git
> in the jit branch
>  
> https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
> 
> to build --with-llvm has to be passed to configure, llvm-config either
> needs to be in PATH or provided with LLVM_CONFIG to make. A c++ compiler
> and clang need to be available under common names or provided via CXX /
> CLANG respectively.
> 
> Regards,
> 
> Andres Freund

Hi

I tried to build on Debian sid, using GCC 7 and LLVM 5. I used the following 
to compile, using your branch @3195c2821d :

$ export LLVM_CONFIG=/usr/bin/llvm-config-5.0
$ ./configure --with-llvm
$ make

And I had the following build error :
llvmjit_wrap.cpp:32:10: fatal error: llvm-c/DebugInfo.h: No such file or directory
 #include "llvm-c/DebugInfo.h"
  ^~~~
compilation terminated.

In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
C++ API in llvm/IR/DebugInfo.h.

For 'sport' (I have not played with the LLVM API for more than a year), I
tried to fix it, changing it to the C++ include.

The DebugInfo related one was easy, only one function was used.
But I still could not build because the LLVM API changed between 5.0 and 6.0 
regarding value info SummaryList. 

llvmjit_wrap.cpp: In function ‘std::unique_ptr<llvm::StringMap<llvm::StringSet<> > > llvm_build_inline_plan(llvm::Module*)’:
llvmjit_wrap.cpp:285:48: error: ‘class llvm::GlobalValueSummary’ has no member named ‘getBaseObject’
   fs = llvm::cast<llvm::FunctionSummary>(gvs->getBaseObject());
                                               ^

That one was a bit uglier.

I'm not sure how to test everything properly, so the patch is attached for 
both these issues, do as you wish with it… :)

Regards

 Pierre Ducroquet

From fdfea09dd7410d6ed7ad54df1ba3092bd0eecb92 Mon Sep 17 00:00:00 2001
From: Pierre Ducroquet 
Date: Wed, 24 Jan 2018 22:28:34 +0100
Subject: [PATCH] Allow building with LLVM 5.0

---
 src/backend/lib/llvmjit_wrap.cpp | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/backend/lib/llvmjit_wrap.cpp b/src/backend/lib/llvmjit_wrap.cpp
index b745aec4fe..7961148a85 100644
--- a/src/backend/lib/llvmjit_wrap.cpp
+++ b/src/backend/lib/llvmjit_wrap.cpp
@@ -29,7 +29,6 @@ extern "C"
 
 #include "llvm-c/Core.h"
 #include "llvm-c/BitReader.h"
-#include "llvm-c/DebugInfo.h"
 
 #include 
 #include 
@@ -50,6 +49,7 @@ extern "C"
 #include "llvm/Analysis/ModuleSummaryAnalysis.h"
 #include "llvm/Bitcode/BitcodeReader.h"
 #include "llvm/IR/CallSite.h"
+#include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/ModuleSummaryIndex.h"
 #include "llvm/Linker/IRMover.h"
@@ -218,6 +218,13 @@ llvm_inline(LLVMModuleRef M)
 	llvm_execute_inline_plan(mod, globalsToInline.get());
 }
 
+
+inline llvm::GlobalValueSummary *GlobalValueSummary__getBaseObject(llvm::GlobalValueSummary *gvs) {
+  if (auto *AS = llvm::dyn_cast<llvm::AliasSummary>(gvs))
+    return &AS->getAliasee();
+  return gvs;
+}
+
 /*
  * Build information necessary for inlining external function references in
  * mod.
@@ -282,7 +289,7 @@ llvm_build_inline_plan(llvm::Module *mod)
 			const llvm::Module *defMod;
 			llvm::Function *funcDef;
 
-			fs = llvm::cast<llvm::FunctionSummary>(gvs->getBaseObject());
+			fs = llvm::cast<llvm::FunctionSummary>(GlobalValueSummary__getBaseObject(gvs.get()));
 			elog(DEBUG2, "func %s might be in %s",
  funcName.data(),
  modPath.data());
@@ -476,7 +483,7 @@ load_module(llvm::StringRef Identifier)
 	 * code. Until that changes, not much point in wasting memory and cycles
 	 * on processing debuginfo.
 	 */
-	LLVMStripModuleDebugInfo(mod);
+	llvm::StripDebugInfo(*llvm::unwrap(mod));
 
 	return std::unique_ptr<llvm::Module>(llvm::unwrap(mod));
 }
-- 
2.15.1



JIT compiling with LLVM v9.0

2018-01-23 Thread Andres Freund
Hi,

I've spent the last weeks working on my LLVM compilation patchset. In
the course of that I *heavily* revised it. While still a good bit away
from committable, it's IMO definitely not a prototype anymore.

There's too many small changes, so I'm only going to list the major
things. A good bit of that is new. The actual LLVM IR emission itself
hasn't changed that drastically.  Since I've not described them in
detail before I'll describe from scratch in a few cases, even if things
haven't fully changed.


== JIT Interface ==

To avoid emitting code in very small increments (increases mmap/mremap
rw vs exec remapping, compile/optimization time), code generation
doesn't happen for every single expression individually, but in batches.

The basic object to emit code via is a jit context created with:
  extern LLVMJitContext *llvm_create_context(bool optimize);
which in the case of expressions is stored on-demand in the EState. For
other use cases that might not be the right location.

To emit LLVM IR (i.e. the portable code that LLVM then optimizes and
generates native code for), one gets a module from that with:
  extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);

to which "arbitrary" numbers of functions can be added. In case of
expression evaluation, we get the module once for every expression, and
emit one function for the expression itself, and one for every
applicable/referenced deform function.

As explained above, we do not want to emit code immediately from within
ExecInitExpr()/ExecReadyExpr(). To facilitate that, readying a JITed
expression sets the function to a callback, which gets the actual native
function on the first actual call.  That allows batching together the
generation of all native functions that are defined before the first
expression is evaluated - in a lot of queries that'll be all.

Said callback then calls
  extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
which'll emit code for the "in progress" mutable module if necessary,
and then searches all generated functions for the name. The names are
created via
  extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
currently "evalexpr" and "deform" with a generation and counter suffix.

Currently expressions which do not have access to an EState, basically
all "parent"-less expressions, aren't JIT compiled. That could be
changed, but I so far do not see a huge need.
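
The first-call callback described above can be sketched roughly as follows,
in plain C. This is not code from the patch: ExprSketch, ready_expr(),
eval_first_call(), resolve_function() and native_add_one() are hypothetical
names, and resolve_function() merely stands in for llvm_get_function()
without doing any code generation. The point is only that readying an
expression installs a trampoline, and the first evaluation swaps in the
resolved native function so later calls go there directly.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ExprSketch ExprSketch;
typedef int (*EvalFunc) (ExprSketch *expr, int input);

struct ExprSketch
{
    EvalFunc    evalfunc;       /* called for every evaluation */
    const char *funcname;       /* name of the generated function */
};

int         lookups = 0;        /* how often a function was resolved */

/* Stand-in for natively compiled code. */
int
native_add_one(ExprSketch *expr, int input)
{
    (void) expr;
    return input + 1;
}

/*
 * Stand-in for llvm_get_function(): the real thing would emit code for
 * the pending module if necessary and then look the name up.
 */
EvalFunc
resolve_function(const char *funcname)
{
    assert(funcname != NULL);
    lookups++;
    return native_add_one;
}

/* First-call trampoline: resolve, patch the pointer, forward the call. */
int
eval_first_call(ExprSketch *expr, int input)
{
    expr->evalfunc = resolve_function(expr->funcname);
    return expr->evalfunc(expr, input);
}

/* Readying an expression only installs the trampoline. */
void
ready_expr(ExprSketch *expr, const char *funcname)
{
    expr->funcname = funcname;
    expr->evalfunc = eval_first_call;
}
```

Because nothing is resolved in ready_expr(), every expression readied before
the first evaluation can end up in the same batch of generated code.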


== Error handling ==

There are two aspects to error handling.

Firstly, generated (LLVM IR) and emitted functions (mmap()ed segments)
need to be cleaned up both after a successful query execution and after
an error.  I've settled on a fairly boring resowner based mechanism. On
errors all expressions owned by a resowner are released, upon success
expressions are reassigned to the parent / released on commit (unless
executor shutdown has cleaned them up of course).


A second, less pretty and newly developed, aspect of error handling is
OOM handling inside LLVM itself. The above resowner based mechanism
takes care of cleaning up emitted code upon ERROR, but there's also the
chance that LLVM itself runs out of memory. LLVM by default does *not*
use any C++ exceptions. Its allocations are primarily funneled through
the standard "new" handlers, and some direct use of malloc() and
mmap(). For the former a 'new handler' exists
(http://en.cppreference.com/w/cpp/memory/new/set_new_handler); for the
latter LLVM provides callbacks that get called upon failure
(unfortunately mmap() failures are treated as fatal rather than OOM
errors).
What I've chosen to do, and I'd be interested to get some input about
that, is to have two functions that LLVM using code must use:
  extern void llvm_enter_fatal_on_oom(void);
  extern void llvm_leave_fatal_on_oom(void);
Before interacting with LLVM code (i.e. emitting IR, or using the above
functions) llvm_enter_fatal_on_oom() needs to be called.

When a libstdc++ new or LLVM error occurs, the handlers set up by the
above functions trigger a FATAL error. We have to use FATAL rather than
ERROR, as we *cannot* reliably throw ERROR inside a foreign library
without risking corrupting its internal state.

Users of the above sections do *not* have to use PG_TRY/CATCH blocks;
the handlers are instead reset at the toplevel sigsetjmp() level.


Using a relatively small enter/leave protected section of code, rather
than setting up these handlers globally, avoids negative interactions
with extensions that might use C++ like e.g. postgis. As LLVM code
generation should never execute arbitrary code, just setting these
handlers temporarily ought to suffice.
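
A rough sketch of how such an enter/leave pair might be structured, in plain
C. This is not the patch's implementation. The nesting counter is an
assumption, the *_sketch function names are invented, and
current_oom_handler is only a stand-in for the handler slot that the real
code would set through the C++ 'new handler' and LLVM's own callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*OomHandler) (void);

/* Stand-in for the library's handler slot (hypothetical). */
OomHandler  current_oom_handler = NULL;

OomHandler  saved_oom_handler = NULL;
int         fatal_on_oom_depth = 0;

/* What the installed handler would do: give up instead of unwinding. */
void
fatal_oom_handler(void)
{
    fprintf(stderr, "FATAL: out of memory inside the JIT library\n");
    abort();
}

/* Install the fatal handler on the outermost enter only. */
void
enter_fatal_on_oom_sketch(void)
{
    if (fatal_on_oom_depth == 0)
    {
        saved_oom_handler = current_oom_handler;
        current_oom_handler = fatal_oom_handler;
    }
    fatal_on_oom_depth++;
}

/* Restore the previous handler on the outermost leave only. */
void
leave_fatal_on_oom_sketch(void)
{
    assert(fatal_on_oom_depth > 0);
    fatal_on_oom_depth--;
    if (fatal_on_oom_depth == 0)
        current_oom_handler = saved_oom_handler;
}

/* Called from top-level error recovery, so callers need no PG_TRY. */
void
reset_fatal_on_oom_sketch(void)
{
    if (fatal_on_oom_depth > 0)
    {
        fatal_on_oom_depth = 0;
        current_oom_handler = saved_oom_handler;
    }
}
```

Keeping the handler installed only between enter and leave is what limits
the effect on other C++ code loaded into the same backend.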


== LLVM Interface / patches ==

Unfortunately a bit of required LLVM functionality, particularly around
error handling but also initialization, isn't currently fully exposed
via LLVM's C-API.  A bit more *optional* API isn't exposed either.

Instead of requiring a brand-new version of LLVM that has exposed this
functionality I decided it's better to have a