Re: [iovisor-dev] minutes: IO Visor TSC/Dev Meeting

2019-04-05 Thread Jiong Wang

Great, will have a look, thanks!

Regards,
Jiong

On 05/04/2019 07:45, Y Song wrote:

Hi, Jiong,

To follow up the iovisor meeting discussion, the below is my prototype
for an end_loop
instruction in llvm:
https://github.com/yonghong-song/llvm/commit/b83226772100317092cae6478229ed6ca3b9903c
The goal is to help verifier to just focus on these marked cases,
rejecting any other backedges.

Please let me and John know if you get some better idea for bounded
loop support in llvm/verifier.

Thanks,

Yonghong

On Wed, Apr 3, 2019 at 11:49 AM Brenden Blanco  wrote:

Hi All,

Thanks for the good discussion today! Below are my notes.

Thanks,
Brenden

=== Discussion ===

[...]

Yonghong:
* Compile once run anywhere work continues
   * bitfield handling bugs in IR/debuginfo

Daniel:
* Global support work continues
   * BTF side patches submitted to bpf mailing list
   * tests included

Jiong:
* 32 bit patch set
   * test methodology improvements
   * updated patches later in the week
   * some concerns around shifts, to be addressed in later improvements

Andrii:
* BTF and compile-once work integration
   to share prototype tool with Saeed

Brendan:
* Is there a tool to measure queue latency in qdisc->netdev layer?
   * debian/ubuntu are packaging bpftrace
   * except libbcc renamed to libbpf_cc
   * some issues with mixing iovisor's libbcc and debian's

Jesper:
* Fedora adding packaging support for libbpf

Alexei:
* systemd also adding support for libbpf - link to be provided?

=== Attendees ===
Brenden Blanco
Andril Nakryiko
Daniel Borkmann
Jesper Brouer
Jiong Wang
Marco Leogrande
Michael Savisko
Paul Chaignon
Quentin Monnet
Alexei Starovoitov
Saeed
Flavio
Rony
Jonathan Lemon
Brendan Gregg
Dan Siemon
Joe Stringer
John





-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1645): https://lists.iovisor.org/g/iovisor-dev/message/1645
Mute This Topic: https://lists.iovisor.org/mt/30527411/21656
Group Owner: iovisor-dev+ow...@lists.iovisor.org
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [iovisor-dev] math between pkt pointer and register with unbounded min value is not allowed #verifier

2019-03-27 Thread Jiong Wang


> On 27 Mar 2019, at 16:43, Simon  wrote:
> 
> Thx a lot for your time Jiong.
> 
> The more I played with bpf/xdp, the more I understand that the challenge is 
> about making "optimized byte code" compliant for the verifier.
> 
> How could I do this kind of checks my self ? I mean looking how llvm 
> optimized my code ? (to be able to do same kind of analyses you do above?)

Just my humble opinion, I would recommend:
   
  1.  get used to verifier rejection information, for example:  
 

 
  R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) 
R3=inv(id=0) R4=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) R5=inv5 
R10=fp0,call_-1
40: (0f) r1 += r3   
math between pkt pointer and register with unbounded min value is not allowed

 
  It tells you the status of each registers at the rejection point,   
  for example, now R3 is “inv”, meaning a scalar value (not a pointer), 
  
  and is without value range, then r4 has value range, and maximum value
 
  is 504. 

 
  2. known what verifier will reject. Could refer to: 

 
 
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/verifier?id=473c5daa86ffe91e937856cc32b4faa61db2e3e3

 
 those are unit examples of what will be rejected, and some of them are 
with 
 meaningful test name or comments so could be easy to understand. 

 
Regards,
 
Jiong 
 


> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1632): https://lists.iovisor.org/g/iovisor-dev/message/1632
Mute This Topic: https://lists.iovisor.org/mt/30285987/21656
Mute #verifier: https://lists.iovisor.org/mk?hashtag=verifier=2590197
Group Owner: iovisor-dev+ow...@lists.iovisor.org
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [iovisor-dev] math between pkt pointer and register with unbounded min value is not allowed #verifier

2019-03-27 Thread Jiong Wang


> On 27 Mar 2019, at 16:11, Jiong Wang via Lists.Iovisor.Org 
>  wrote:
> 
>> 
>> On 27 Mar 2019, at 14:53, Simon  wrote:
>> 
>> Hi Jiong,
>>   I didn't succeed to generate .i file using bcc, but since severals days I 
>> try to rewrite my code without bcc. (directly with bpf C api / clang / 
>> iproute2)
>> 
>>   I didn't finished yet, but I have a reduced version compared to the one I 
>> written for bcc and I face the same issue, so this time I can get .i file 
>> easily.
>> 
>>   So the .i file is as attachment.
>>   The corresponding code is available here : 
>> https://github.com/sbernard31/udploadbalancer/blob/44fe1ea549a55ab23c7d1b70e9651df6f61fb865/ulb.c
> 
> 
> Hi Simon,
> 
>  Thanks for the .i, I prototyped some byteswap code-gen change, but
> seems doesn’t help your issue which could narrow down to the following
> general code pattern:
> 
> unsigned char cal(unsigned int a, unsigned char *b)   
> { 
>  if (a < 8 || a > 512)   
>return 0; 
> 
>  return b[a]; 
>   
> } 
> 
> LLVM is doing some optimisation, instead of generating two separate 
> comparison,
> a < 8 and a > 512, it is combining them because a negative value when casted
> into unsigned must be bigger than 504, so above code turned into
> 
> unsigned char cal(unsigned int a, unsigned char *b)   
> {
>  unsigned tmp = a - 8; 
>  if (tmp > 504)
>return 0; 
> 
>  return b[a]; 
>   
> } 
> 
> The consequence of such optimisation is new variable “tmp” is used for 
> comparison
> And verifier now know “tmp”'s value range instead of the original “a” which 
> is used
> later adding to a packet pointer. A unknown value range of “a” then caused the
> verifier rejection. 
> 
> So, I suspect any code using above c code pattern will likely be rejected.
> 
>  1. combinable comparisons
>  2. the variable involved in the comparison used later in places requiring 
> value range

And in your code, after you insert those printk, they made the following
two comparisons non-combinable any more, so udp_len is used for the
comparison and got correct value range to pass the later pkt pointer
addition check.


if (udp_len < 8) {
return XDP_DROP;
}
if (udp_len > 512) {
return XDP_DROP;
}

> 
> Regards,
> Jiong
> 
>> 
>>   The error is still  math between pkt pointer and register with unbounded 
>> min value is not allowed
>> 
>>   The verifier output is pretty much the same :
>> 
>> 27: (71) r4 = *(u8 *)(r1 +23)
>> 28: (b7) r0 = 2
>> 29: (55) if r4 != 0x11 goto pc+15
>> R0=inv2 R1=pkt(id=0,off=0,r=34,imm=0) R2=pkt_end(id=0,off=0,imm=0) 
>> R3=pkt(id=0,off=34,r=34,imm=0) R4=inv17 R5=inv5 R10=fp0,call_-1
>> 30: (07) r3 += 8
>> 31: (b7) r0 = 1
>> 32: (2d) if r3 > r2 goto pc+12
>> R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) 
>> R3=pkt(id=0,off=42,r=42,imm=0) R4=inv17 R5=inv5 R10=fp0,call_-1
>> 33: (69) r3 = *(u16 *)(r1 +38)
>> 34: (dc) r3 = be16 r3
>> 35: (bf) r4 = r3
>> 36: (07) r4 += -8
>> 37: (57) r4 &= 65535
>> 38: (b7) r0 = 1
>> 39: (25) if r4 > 0x1f8 goto pc+5
>> R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) 
>> R3=inv(id=0) R4=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) R5=inv5 
>> R10=fp0,call_-1
>> 40: (0f) r1 += r3
>> math between pkt pointer and register with unbounded min value is not allowed
>> 
>> I'm pretty sure the bcc version I used before linked statically clang/llvm 
>> v7.0.
>> Here I use a v6.0.
>> 
>> The funny part ...  This modification about just adding some logs/printk 
>> makes this error disappear : 
>> https://github.com/sbernard31/udploadbalancer/commit/0145538c7b35e2a6bb92225f69a45f4bee120a6d
>> 
>> All of those erifier errors make me a bit crazy (╥_╥)
>> 
>> HTH
>> 
>> Simon
>> 
>> 
>> 
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1630): https://lists.iovisor.org/g/iovisor-dev/message/1630
Mute This Topic: https://lists.iovisor.org/mt/30285987/21656
Mute #verifier: https://lists.iovisor.org/mk?hashtag=verifier=2590197
Group Owner: iovisor-dev+ow...@lists.iovisor.org
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [iovisor-dev] math between pkt pointer and register with unbounded min value is not allowed #verifier

2019-03-27 Thread Jiong Wang


> On 27 Mar 2019, at 14:53, Simon  wrote:
> 
> Hi Jiong,
>I didn't succeed to generate .i file using bcc, but since severals days I 
> try to rewrite my code without bcc. (directly with bpf C api / clang / 
> iproute2)
> 
>I didn't finished yet, but I have a reduced version compared to the one I 
> written for bcc and I face the same issue, so this time I can get .i file 
> easily.
> 
>So the .i file is as attachment.
>The corresponding code is available here : 
> https://github.com/sbernard31/udploadbalancer/blob/44fe1ea549a55ab23c7d1b70e9651df6f61fb865/ulb.c


Hi Simon,

  Thanks for the .i, I prototyped some byteswap code-gen change, but
seems doesn’t help your issue which could narrow down to the following
general code pattern:

unsigned char cal(unsigned int a, unsigned char *b)   
{ 
  if (a < 8 || a > 512)   
return 0; 

 
  return b[a];  
 
} 

LLVM is doing some optimisation, instead of generating two separate comparison,
a < 8 and a > 512, it is combining them because a negative value when casted
into unsigned must be bigger than 504, so above code turned into

unsigned char cal(unsigned int a, unsigned char *b)   
{
  unsigned tmp = a - 8; 
  if (tmp > 504)
return 0; 

 
  return b[a];  
 
} 

The consequence of such optimisation is new variable “tmp” is used for 
comparison
And verifier now know “tmp”'s value range instead of the original “a” which is 
used
later adding to a packet pointer. A unknown value range of “a” then caused the
verifier rejection. 

So, I suspect any code using above c code pattern will likely be rejected.

  1. combinable comparisons
  2. the variable involved in the comparison used later in places requiring 
value range

Regards,
Jiong

> 
>The error is still  math between pkt pointer and register with unbounded 
> min value is not allowed
> 
>The verifier output is pretty much the same :
> 
> 27: (71) r4 = *(u8 *)(r1 +23)
> 28: (b7) r0 = 2
> 29: (55) if r4 != 0x11 goto pc+15
>  R0=inv2 R1=pkt(id=0,off=0,r=34,imm=0) R2=pkt_end(id=0,off=0,imm=0) 
> R3=pkt(id=0,off=34,r=34,imm=0) R4=inv17 R5=inv5 R10=fp0,call_-1
> 30: (07) r3 += 8
> 31: (b7) r0 = 1
> 32: (2d) if r3 > r2 goto pc+12
>  R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) 
> R3=pkt(id=0,off=42,r=42,imm=0) R4=inv17 R5=inv5 R10=fp0,call_-1
> 33: (69) r3 = *(u16 *)(r1 +38)
> 34: (dc) r3 = be16 r3
> 35: (bf) r4 = r3
> 36: (07) r4 += -8
> 37: (57) r4 &= 65535
> 38: (b7) r0 = 1
> 39: (25) if r4 > 0x1f8 goto pc+5
>  R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) 
> R3=inv(id=0) R4=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) R5=inv5 
> R10=fp0,call_-1
> 40: (0f) r1 += r3
> math between pkt pointer and register with unbounded min value is not allowed
> 
> I'm pretty sure the bcc version I used before linked statically clang/llvm 
> v7.0.
> Here I use a v6.0.
> 
> The funny part ...  This modification about just adding some logs/printk 
> makes this error disappear : 
> https://github.com/sbernard31/udploadbalancer/commit/0145538c7b35e2a6bb92225f69a45f4bee120a6d
> 
> All of those erifier errors make me a bit crazy (╥_╥)
> 
> HTH
> 
> Simon
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1629): https://lists.iovisor.org/g/iovisor-dev/message/1629
Mute This Topic: https://lists.iovisor.org/mt/30285987/21656
Mute #verifier: https://lists.iovisor.org/mk?hashtag=verifier=2590197
Group Owner: iovisor-dev+ow...@lists.iovisor.org
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [iovisor-dev] math between pkt pointer and register with unbounded min value is not allowed #verifier

2019-03-07 Thread Jiong Wang


Yonghong Song writes:

> On Wed, Mar 6, 2019 at 7:08 AM  wrote:
>>
>> I'm playing with bcc to prototype an UDP load balancer.
>>
>> I'm facing an issue that I didn't succeed to understand...
>>
>> In my code I tried to validate my UDP packet using code like this :
>>
>> struct udphdr *udp;
>> udp = iph + 1;
>> if (udp + 1 > data_end)
>> return XDP_DROP;
>> __u16 udp_len = bpf_ntohs(udp->len);
>> //__u16 udp_len = 8;
>> if (udp_len < 8)
>> return XDP_DROP;
>> if (udp_len > 512) // TODO use a more approriate max value
>> return XDP_DROP;
>> if ((void *) udp + udp_len > data_end)
>> return XDP_DROP;
>>
>> And the verifier does not like it ..
>
> This is caused by compiler optimizations.
>
>>
>> 28: (71) r2 = *(u8 *)(r7 +23)
>> 29: (b7) r0 = 2
>> 30: (55) if r2 != 0x11 goto pc+334
>>  R0=inv2 R1=pkt_end(id=0,off=0,imm=0) R2=inv17 R3=inv5 
>> R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=34,imm=0) 
>> R8=pkt(id=0,off=34,r=34,imm=0) R9=pkt(id=0,off=14,r=34,imm=0) R10=fp0,call_-1
>> 31: (bf) r2 = r8
>> 32: (07) r2 += 8
>> 33: (b7) r0 = 1
>> 34: (2d) if r2 > r1 goto pc+330
>>  R0=inv1 R1=pkt_end(id=0,off=0,imm=0) R2=pkt(id=0,off=42,r=42,imm=0) R3=inv5 
>> R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=42,imm=0) 
>> R8=pkt(id=0,off=34,r=42,imm=0) R9=pkt(id=0,off=14,r=42,imm=0) R10=fp0,call_-1
>> 35: (69) r3 = *(u16 *)(r7 +38)
>> 36: (dc) r3 = be16 r3
>
> r3 get the value from memory, its value could be any one as permitted
> by the type.
>
>> 37: (bf) r2 = r3
>> 38: (07) r2 += -8
>> 39: (57) r2 &= 65535
>> 40: (b7) r0 = 1 
>> 41: (25) if r2 > 0x1f8 goto pc+323
>
> test is done by r2. We indeed get better range for r2 (below:
> R2=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) )
> but r3 range is not tightened.

I had run into similar issue when debugging some other rejection before
JMP32 introduced when LLVM was generating similar sequences under defult
64-bit mode, but IIRC LLVM generates betweer sequences with -mattr=alu32,
under which it will just use w3 (as the type should be optimized into
32-bit) for the comparison.

So, I guess this testcase could have easier sequence for verifier under
ALU32 mode. But for this case, BPF_END is used which doesn't have
sub-register code-gen support inside LLVM for be16 and be32 at the moment
(noticed this several days ago when doing some other benchmarking).

If I have .i file, I could do a quick prototype to see if ALU32 could
improve this.

Regards,
Jiong

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1601): https://lists.iovisor.org/g/iovisor-dev/message/1601
Mute This Topic: https://lists.iovisor.org/mt/30285987/21656
Mute #verifier: https://lists.iovisor.org/mk?hashtag=verifier=2590197
Group Owner: iovisor-dev+ow...@lists.iovisor.org
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [iovisor-dev] minutes: IO Visor TSC/Dev Meeting

2019-01-24 Thread Jiong Wang

On 24/01/2019 02:33, Brenden Blanco wrote:

Jiong:
* respin jmp32 patch set - v2
   - implementation in working state
   - isa for subregister looking good
   - codegen giving smaller results (num of instructions)
   - will resend patch set after fixing test failure


Finished re-spin and rebased on top of latest bpf-next.

My set will potentially conflict with the following three other bpf-next
pending patches/sets on selftests Makefile and test_verifier.c:
  - Stanislav's LLVM_READELF change on Makefile.
  - Alexei's spin_lock change on test_verifier.c.
  - Jakub's test_verifier.c refactor on test_verifier.c.

The conflict should be easy to resolve. For test_verifier.c, my set just
adds new tests at the end of the test array. I will send out v3 shortly.

Regards,
Jiong


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1558): https://lists.iovisor.org/g/iovisor-dev/message/1558
Mute This Topic: https://lists.iovisor.org/mt/29521656/21656
Group Owner: iovisor-dev+ow...@lists.iovisor.org
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [iovisor-dev] minutes: IO Visor TSC/Dev Meeting

2018-05-17 Thread Jiong Wang via iovisor-dev

On 17/05/2018 05:38, Brenden Blanco via iovisor-dev wrote:

Hi All,

Thanks for attending the meeting today. As usual, here are my notes.

=== Discussion ===
Bjorn
- AF_XDP zero copy implementation RFC on netdev
- some uapi concerns, may need to back out changes
- but, over next couple weeks before merge window, discuss and improve in
  hopes of merging
- clean up non-intel code, respin, leave out i40e patches at the end
- separate i40e cleanup spin
- bcm may try to join as 2nd vendor to flesh out uapi
- some discussion of common code organization
- xdp in separate file helps backporting

Yonghong
- some python3 fixes
- proposals for improved container support
- rewriter complexity increasing
- tracing event introspection

Martin
- BTF id landed
- working on bpftool support
- still using pahole to convert dwarf
- still unclear how integration with llvm will look

Jiong
- dominator tree info RFCs pushed
- significant mem/exec time costs to run (e.g. 2x more, 3x slower)


Just for the record, if I listened correctly, John proposed to introduce
a switch for the new CFG infrastructure, i.e, we keep the exisiting DFS
based check_cfg, the new CFG code will only be enabled when check_cfg
detected a loop and wants know whether it is bounded, by this we could
avoid extra mem/exec time cost on programs that don't have loop.

Regards,
Jiong


John
- induction variable tracking in progress
- next: min/max value tracking
both: to work on topic branch under bpf-next

William
- tried workaround from Yonghong
- still need to dig into root cause

Joe
- implementation of sock lookup going through iterations
- concerns over locking in bpf being a leaky implementation detail


=== Attendees ===
Brenden Blanco
Alexei Starovoitov
Bjorn Topel
Daniel Borkmann
Jesper Brouer
Jiong Wang
Magnus Karlsson
Quentin Monnet
Joe Stringer
Andy Gospodarek
Ferris
Brendan Gregg
Jakub Kicinski
William Tu
Yonghong Song
Alexander Duyck
Sandipan
John F
___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH, bpf-llvm] Fix bug on silently truncating 64-bit immediate

2017-10-12 Thread Jiong Wang via iovisor-dev

On 12/10/2017 12:04, Jesper Dangaard Brouer wrote:

On Thu, 12 Oct 2017 06:21:30 -0400
Jiong Wang <jiong.w...@netronome.com> wrote:


We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding.  This caused comparison with wrong value.

I think you send this to the wrong mailing list... this looks like a
patch against the LLVM source code.

Shouldn't you send to: llvm-...@lists.llvm.org ?


Hi Jesper,

  I am thinking the users & developers of eBPF llvm are quite focused in
xdp-newbies and iovisor-dev, so if the code is related with user visible
effect, for example bug fix, and if the code change is straightforward
and small, I tend to just send it to these two so related user could be
aware of it in case they are encountering the same problem, and then this
patch could be posted to llvm review site (https://reviews.llvm.org).

  

This bug looks to be introduced by r308080.

This looks like a very recent change:
   https://llvm.org/viewvc/llvm-project?view=revision=308080
   Sat Jul 15 05:41:42 2017 UTC (2 months, 4 weeks ago)

(As you are sending this to a user mailing list: xdp-newb...@vger.kernel.org)

I want to know if this have made it into a LLVM release? and which release?


This has not been made into any LLVM official release.
The bug may show up if the user is trying to use distributor snapshop release.



As the git-repo[1] replica of LLVM SVN-repo does not git-tag the
releases, I could not answer this question myself via the command:

$ git describe --contains 7c423e0690

[1] http://llvm.org/git/llvm.git


The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.

Reported-by: Jakub Kicinski <jakub.kicin...@netronome.com>
Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
  lib/Target/BPF/BPFISelLowering.cpp |  8 ++--
  lib/Target/BPF/BPFInstrInfo.td |  2 +-
  test/CodeGen/BPF/select_ri.ll  | 35 +++
  3 files changed, 42 insertions(+), 3 deletions(-)


--

Regards,
Jiong

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


[iovisor-dev] [PATCH, bpf-llvm] Fix bug on silently truncating 64-bit immediate

2017-10-12 Thread Jiong Wang via iovisor-dev
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding.  This caused comparison with wrong value.

This bug looks to be introduced by r308080.  The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.

Reported-by: Jakub Kicinski <jakub.kicin...@netronome.com>
Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 lib/Target/BPF/BPFISelLowering.cpp |  8 ++--
 lib/Target/BPF/BPFInstrInfo.td |  2 +-
 test/CodeGen/BPF/select_ri.ll  | 35 +++
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/lib/Target/BPF/BPFISelLowering.cpp 
b/lib/Target/BPF/BPFISelLowering.cpp
index d4e06dd..995f206 100644
--- a/lib/Target/BPF/BPFISelLowering.cpp
+++ b/lib/Target/BPF/BPFISelLowering.cpp
@@ -611,11 +611,15 @@ 
BPFTargetLowering::EmitInstrWithCustomInserter(MachineInstr ,
 .addReg(LHS)
 .addReg(MI.getOperand(2).getReg())
 .addMBB(Copy1MBB);
-  else
+  else {
+int64_t imm32 = MI.getOperand(2).getImm();
+// sanity check before we build J*_ri instruction.
+assert (isInt<32>(imm32));
 BuildMI(BB, DL, TII.get(NewCC))
 .addReg(LHS)
-.addImm(MI.getOperand(2).getImm())
+.addImm(imm32)
 .addMBB(Copy1MBB);
+  }
 
   // Copy0MBB:
   //  %FalseValue = ...
diff --git a/lib/Target/BPF/BPFInstrInfo.td b/lib/Target/BPF/BPFInstrInfo.td
index fcd6a60..a3ad2ee 100644
--- a/lib/Target/BPF/BPFInstrInfo.td
+++ b/lib/Target/BPF/BPFInstrInfo.td
@@ -460,7 +460,7 @@ let usesCustomInserter = 1 in {
   (ins GPR:$lhs, i64imm:$rhs, i64imm:$imm, GPR:$src, 
GPR:$src2),
   "# Select PSEUDO $dst = $lhs $imm $rhs ? $src : $src2",
   [(set i64:$dst,
-   (BPFselectcc i64:$lhs, (i64 imm:$rhs), (i64 imm:$imm), 
i64:$src, i64:$src2))]>;
+   (BPFselectcc i64:$lhs, (i64immSExt32:$rhs), (i64 
imm:$imm), i64:$src, i64:$src2))]>;
 }
 
 // load 64-bit global addr into register
diff --git a/test/CodeGen/BPF/select_ri.ll b/test/CodeGen/BPF/select_ri.ll
index b802b64..7b1f852 100644
--- a/test/CodeGen/BPF/select_ri.ll
+++ b/test/CodeGen/BPF/select_ri.ll
@@ -25,3 +25,38 @@ entry:
 }
 
 attributes #0 = { norecurse nounwind readonly }
+
+; test immediate out of 32-bit range
+; Source file:
+
+; unsigned long long
+; load_word(void *buf, unsigned long long off)
+; asm("llvm.bpf.load.word");
+;
+; int
+; foo(void *buf)
+; {
+;  unsigned long long sum = 0;
+;
+;  sum += load_word(buf, 100);
+;  sum += load_word(buf, 104);
+;
+;  if (sum != 0x1ULL)
+;return ~0U;
+;
+;  return 0;
+;}
+
+; Function Attrs: nounwind readonly
+define i32 @foo(i8*) local_unnamed_addr #0 {
+  %2 = tail call i64 @llvm.bpf.load.word(i8* %0, i64 100)
+  %3 = tail call i64 @llvm.bpf.load.word(i8* %0, i64 104)
+  %4 = add i64 %3, %2
+  %5 = icmp ne i64 %4, 8589934591
+; CHECK:  r{{[0-9]+}} = 8589934591 ll
+  %6 = sext i1 %5 to i32
+  ret i32 %6
+}
+
+; Function Attrs: nounwind readonly
+declare i64 @llvm.bpf.load.word(i8*, i64) #1
-- 
2.7.4

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support

2017-09-19 Thread Jiong Wang via iovisor-dev

On 18/09/2017 22:29, Daniel Borkmann wrote:

On 09/18/2017 10:47 PM, Jiong Wang wrote:

Hi,

   Currently, LLVM eBPF backend always generate code in 64-bit mode, 
this may

cause troubles when JITing to 32-bit targets.

   For example, it is quite common for XDP eBPF program to access 
some packet
fields through base + offset that the default eBPF will generate 
BPF_ALU64 for
the address formation, later when JITing to 32-bit hardware, 
BPF_ALU64 needs
to be expanded into 32 bit ALU sequences even though the address 
space is

32-bit that the high bits is not significant.

   While a complete 32-bit mode implemention may need an new ABI 
(something like
-target-abi=ilp32), this patch set first add some initial code so we 
could

construct 32-bit eBPF tests through hand-written assembly.

   A new 32-bit register set is introduced, its name is with "w" 
prefix and LLVM
assembler will encode statements like "w1 += w2" into the following 
8-bit code

field:

 BPF_ADD | BPF_X | BPF_ALU

BPF_ALU will be used instead of BPF_ALU64.

   NOTE, currently you can only use "w" register with ALU statements, 
not with
others like branches etc as they don't have different encoding for 
32-bit

target.


Great to see work in this direction! Can we also enable to use / emit
all the 32bit BPF_ALU instructions whenever possible for the currently
available bpf targets while at it (which only use BPF_ALU64 right now)?


Hi Daniel,

   Thanks for the feedback.

   I think we could also enable the use of all the 32bit BPF_ALU under 
currently
available bpf targets.  As we now have 32bit register set support, we 
could make
i32 type as legal type to prevent it be promoted into i64, then hook it 
up with i32

ALU patterns, will look into this.

Regards,
Jiong
___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH RFC 3/4] New 32-bit register set

2017-09-19 Thread Jiong Wang via iovisor-dev

On 19/09/2017 07:44, Y Song wrote:

Hi, Jiong,

Thanks for the patch! It is a great start to support 32bit register in BPF.
In the past, I have studied a little bit to see whether 32bit register
support may reduce
the number of unnecessary shifts on x86_64 and improve the
performance. Looking through
a few bpf programs and it looks like the opportunity is not great, but
still nice to have if we
have this capability. As you mentioned, this definitely helped 32bit
architecture!

I am not an expert in LLVM tablegen. I briefly looked through the code
change and it looks like
correct. Hopefully some llvm-dev tablegen experts can comment on the
implementation.

Below I only have a couple of minor comments.


Yong Hong,

  Thanks for the review.

  I have addressed your comments and attached the updated patch.

  Do you want me to put this patch set on to llvm review website? I 
guess it is the

  formal review channel?

Regards,
Jiong

Acked-by: Jakub Kicinski <jakub.kicin...@netronome.com>
Signed-off-by: Jiong Wang <jiong.w...@netronome.com>

diff --git a/lib/Target/BPF/BPFRegisterInfo.td 
b/lib/Target/BPF/BPFRegisterInfo.td
index c8e24f8..da1d6b5 100644
--- a/lib/Target/BPF/BPFRegisterInfo.td
+++ b/lib/Target/BPF/BPFRegisterInfo.td
@@ -11,31 +11,42 @@
 //  Declarations that describe the BPF register file
 
//===--===//
 
+let Namespace = "BPF" in {
+  def sub_32 : SubRegIndex<32>;
+}
+
+class Wi<bits<16> Enc, string n> : Register {
+  let HWEncoding = Enc;
+  let Namespace = "BPF";
+}
+
 // Registers are identified with 4-bit ID numbers.
 // Ri - 64-bit integer registers
-class Ri<bits<16> Enc, string n> : Register {
-  let Namespace = "BPF";
+class Ri<bits<16> Enc, string n, list subregs>
+  : RegisterWithSubRegs<n, subregs> {
   let HWEncoding = Enc;
+  let Namespace = "BPF";
+  let SubRegIndices = [sub_32];
 }
 
-// Integer registers
-def R0 : Ri< 0, "r0">, DwarfRegNum<[0]>;
-def R1 : Ri< 1, "r1">, DwarfRegNum<[1]>;
-def R2 : Ri< 2, "r2">, DwarfRegNum<[2]>;
-def R3 : Ri< 3, "r3">, DwarfRegNum<[3]>;
-def R4 : Ri< 4, "r4">, DwarfRegNum<[4]>;
-def R5 : Ri< 5, "r5">, DwarfRegNum<[5]>;
-def R6 : Ri< 6, "r6">, DwarfRegNum<[6]>;
-def R7 : Ri< 7, "r7">, DwarfRegNum<[7]>;
-def R8 : Ri< 8, "r8">, DwarfRegNum<[8]>;
-def R9 : Ri< 9, "r9">, DwarfRegNum<[9]>;
-def R10 : Ri<10, "r10">, DwarfRegNum<[10]>;
-def R11 : Ri<11, "r11">, DwarfRegNum<[11]>;
+foreach I = 0-11 in {
+  // 32-bit Integer (alias to low part of 64-bit register).
+  def W#I  : Wi<I,  "w"#I>,  DwarfRegNum<[I]>;
+  // 64-bit Integer registers
+  def R#I  : Ri<I,  "r"#I,  [!cast("W"#I)]>,  DwarfRegNum<[I]>;
+}
 
 // Register classes.
-def GPR : RegisterClass<"BPF", [i64], 64, (add R1, R2, R3, R4, R5,
-   R6, R7, R8, R9, // callee saved
-   R0,  // return value
-   R11, // stack ptr
-   R10  // frame ptr
-  )>;
+def GPR32 : RegisterClass<"BPF", [i32], 32, (add
+  (sequence "W%u", 1, 9),
+  W0, // Return value
+  W11, // Stack Ptr
+  W10  // Frame Ptr
+)>;
+
+def GPR : RegisterClass<"BPF", [i64], 64, (add
+  (sequence "R%u", 1, 9),
+  R0, // Return value
+  R11, // Stack Ptr
+  R10  // Frame Ptr
+)>;
___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support

2017-09-19 Thread Jiong Wang via iovisor-dev

On 19/09/2017 05:49, Fulvio Risso wrote:

Dear Jiong,

that's a great work.
I havent' gone through the whole patches, but it seems to me that the 
documentation is not that much.
From my past experiences, putting your hands into a compiler without 
at least some high-level documentation that presents how it works, it 
would be a nightmare.


Hi Fulvio,

  Thanks for the feedback.

  I agree that we need some high-level documentation on this. It could let
people evaluating the implementation approach taken and we could make sure
things on on correct direction.

  I was dividing the 32-bit BPF support into the following three parts and
I am thinking the support of the first part could let the user be able to
construct testcases contaning 32-bit ALU which is good for experimenting
and it won't affect code-gen from .c/.ll. Aslo this part is not hard for
review and relatively independent, so I have seperated them out into this
patch set for review.

  Detailed implementation details or high-level design will be included in
the cover letter of patch set for the second part which is the actually
part for 32-bit code-gen enablement.

brief intro on 32-bit BPF support phases
===
* Support in assembly level.

  This includes cleanup of existing BPF instruction patterns to make them
  32-bit support friendly and also the registration of 32-bit register set.

* Full code-gen Support in LLVM.

  This includes making 32-bit integer type as legal type and associate it
  with 32-bit register class, also we need to cleanup operation legalization
  stragegies.

  We need to discuss whether we want to make 32-bit BPF an new target or
  just an new environment in LLVM's concept, so the user could simply enable
  it through something like --triple=bpf-unknown-elf-ilp32. We may also need
  new ELF attributes and new relocation types for map address relocation for
  32-bit eBPF.

* Hook Clang driver with LLVM.

  Support for this is straightforward, especially if we go with new 
environment
  approach.

Regards,

Jiong

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [v2][PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-12 Thread Jiong Wang via iovisor-dev

On Tue, Sep 12, 2017 at 11:10 AM, Y Song  wrote:

Thanks, Jiong,

The patch looks good. I have applied to llvm trunk.
https://reviews.llvm.org/rL313055


Thanks, Yonghong.




Hmm, I actually feel there is no need to offer an separate syntax for
64bit imm
encoding, it might be better to offer a unified single syntax "r =
signed_imm"
to the user and the encoder (BPFMCCodeEmitter::encodeInstruction)
guarantee to
choose optimal encoding.

Encoder could choose BPF_ALU64 | BPF_MOV | BPF_K for both "r1 = -1" and
"r1 = -1ll" while only resorting to BPF_LD | BPF_DW | BPF_IMM when the imm
really
doesn't fit into 32bit (!isInt<32>(imm_opnd)), for example, "r1 =
0x1ll".


Right now, this optimization actually
has been done in compiler side. So depending on the value, the compiler will
use
"move r1, <32_bit_value>" or "ld_imm64 r1, ".
In this case, it make sense to have different insn formats for two different
kinds of
insns.

What you described is to move the optimization down to MC Emit Code stage,
based on
the value, it can choose to use "move" or "ld_imm64" insn. So optimization
will be available
for .c => .o and for .s => .o. (The current scheme, optimization not
available from .s => .o).

Yes, it is possible. But typically, there is one-to-one mapping between asm
insn to insn encoding.
If later on, we found the complain about this, we can revisit this issue.


OK.

Regards,
Jiong

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [v2][PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-12 Thread Jiong Wang via iovisor-dev



- Adjusted constant load tests.
- Interleaved the expected results with input insns.

  * Changed the disassembly output "ll" to "LL" in BPFInstPrinter.cpp.
This is because I noticed MC parser only accept "LL" as long long
suffix.


Sorry for the confusion caused by "ll". I tested your patch and found that
both r = 5 and r = 5LL generated ld_imm64 insn. This is not what compiler
will do. Further debugging I found that this is due to my change to remove
"ll" from asmstring in the insn definition (.td file). Since the assembler is
patten matching based on asmstring. "r = 5" matched to ld_imm64 as well.

I just fixed the issue and restored " ll" suffix for ld_imm64 asmstring.
You can then restore your previous change to add "ll" as a valid identifier
in the middle of asm statement.


Hmm, I actually feel there is no need to offer an separate syntax for 64bit imm
encoding, it might be better to offer a unified single syntax "r = signed_imm"
to the user and the encoder (BPFMCCodeEmitter::encodeInstruction) guarantee to
choose optimal encoding.

Encoder could choose BPF_ALU64 | BPF_MOV | BPF_K for both "r1 = -1" and
"r1 = -1ll" while only resorting to BPF_LD | BPF_DW | BPF_IMM when the imm 
really
doesn't fit into 32bit (!isInt<32>(imm_opnd)), for example, "r1 = 
0x1ll".

Anyway, patch updated, changes are:

  - added "ll" back.
  - updated the test cases for BPF_MOV | BPF_K in ALU64 class and
BPF_IMM | BPF_DW in LD class.
  - Dropped the change from "ll" to "LL". If we decided to move to "LL" which is
in consistent with LLVM generic parser then could clean this up as a
separate patch.

Please review, thanks.

Regards,
Jiong

Reviewed-by: Yonghong Song <y...@fb.com>
Signed-off-by: Jiong Wang <jiong.w...@netronome.com>

---
 include/llvm/MC/MCParser/MCTargetAsmParser.h |   2 +
 lib/MC/MCParser/AsmParser.cpp|   5 +
 lib/Target/BPF/AsmParser/BPFAsmParser.cpp| 472 +++
 lib/Target/BPF/AsmParser/CMakeLists.txt  |   3 +
 lib/Target/BPF/AsmParser/LLVMBuild.txt   |  23 ++
 lib/Target/BPF/CMakeLists.txt|   1 +
 lib/Target/BPF/LLVMBuild.txt |   2 +-
 test/MC/BPF/insn-unit.s  | 168 ++
 test/MC/BPF/lit.local.cfg|   3 +
 9 files changed, 678 insertions(+), 1 deletion(-)
 create mode 100644 lib/Target/BPF/AsmParser/BPFAsmParser.cpp
 create mode 100644 lib/Target/BPF/AsmParser/CMakeLists.txt
 create mode 100644 lib/Target/BPF/AsmParser/LLVMBuild.txt
 create mode 100644 test/MC/BPF/insn-unit.s
 create mode 100644 test/MC/BPF/lit.local.cfg

diff --git a/include/llvm/MC/MCParser/MCTargetAsmParser.h b/include/llvm/MC/MCParser/MCTargetAsmParser.h
index 6d76d51..e5d5a2a 100644
--- a/include/llvm/MC/MCParser/MCTargetAsmParser.h
+++ b/include/llvm/MC/MCParser/MCTargetAsmParser.h
@@ -266,6 +266,8 @@ public:
   virtual bool equalIsAsmAssignment() { return true; };
   // Return whether this start of statement identifier is a label
   virtual bool isLabel(AsmToken ) { return true; };
+  // Return whether this parser accept star as start of statement
+  virtual bool starIsStartOfStatement() { return false; };
 
   virtual const MCExpr *applyModifierToExpr(const MCExpr *E,
 MCSymbolRefExpr::VariantKind,
diff --git a/lib/MC/MCParser/AsmParser.cpp b/lib/MC/MCParser/AsmParser.cpp
index 03858d3..ee3cc8d 100644
--- a/lib/MC/MCParser/AsmParser.cpp
+++ b/lib/MC/MCParser/AsmParser.cpp
@@ -1687,6 +1687,11 @@ bool AsmParser::parseStatement(ParseStatementInfo ,
 // Treat '}' as a valid identifier in this context.
 Lex();
 IDVal = "}";
+  } else if (Lexer.is(AsmToken::Star) &&
+ getTargetParser().starIsStartOfStatement()) {
+// Accept '*' as a valid start of statement.
+Lex();
+IDVal = "*";
   } else if (parseIdentifier(IDVal)) {
 if (!TheCondState.Ignore) {
   Lex(); // always eat a token
diff --git a/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
new file mode 100644
index 000..d00200c
--- /dev/null
+++ b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -0,0 +1,472 @@
+//===-- BPFAsmParser.cpp - Parse BPF assembly to MCInst instructions --===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCInst.

[iovisor-dev] [v2][PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-11 Thread Jiong Wang via iovisor-dev

On 09/09/17 00:47, Y Song wrote:

On Fri, Sep 8, 2017 at 1:00 PM, Jiong Wang <jiong.w...@netronome.com> wrote:


Hi Y Song,

  Thanks for the review and test.

[snip]


I guess, we may need to add something like LD_VAR_ADDR so that
the proper assembly code can be issued. Also we could drop "ll" with
"r1 = tx_port"?



Personally, I prefer drop "ll". The "ll" suffix was there to tell people it is 
64bit
constant while the register "r1" is with 64-bit width so I feel it is enough. 
For
32-bit situation, we need new register set, some thing like "w1 = tx_port".


I just push a patch (similar to your suggestion below but without
assertion) which still has
"ll" suffix for the constant, but no "ll" suffix for symbols. The
reason is that  we use "ll"
in the asm printout to differentiate "r = 5"=>"mov r, 5" (32bit imm) and
"r = 5ll" => "ld_imm64 r, 5ll" (64bit imm).

Also, for long long constant, C standard does not allow space between
value and "ll" and
hence "567 ll" is not allowed to represent a number "567ll". Could you
try to disallow
64bit immediate like "567 ll" as well in your patch?


Hi Y Song,

Thanks, patch updated, please review.

Changes since the initial version:

  * Addressed all comments on instruction unit tests.
- Renamed test file to "insn-unit.s".
- Removed unnecessary comments.
- Added BPF_B tests for load and store.
- Added BPF_K test for jmp and alu.
  (Noticed a seperate issue with unsigned compare then jump. Looks like BPF
   backend is alway printing the immediate as signed, I guess we need to
   change the operand types in instruction patterns.)
- Adjusted constant load tests.
- Interleaved the expected results with input insns.

  * Changed the disassembly output "ll" to "LL" in BPFInstPrinter.cpp.
This is because I noticed MC parser only accept "LL" as long long suffix.
It might be the reason documented here:

  
https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=19759250

Reviewed-by: Yonghong Song <y...@fb.com>
Signed-off-by: Jiong Wang <jiong.w...@netronome.com>

---
 include/llvm/MC/MCParser/MCTargetAsmParser.h  |   2 +
 lib/MC/MCParser/AsmParser.cpp |   5 +
 lib/Target/BPF/AsmParser/BPFAsmParser.cpp | 471 ++
 lib/Target/BPF/AsmParser/CMakeLists.txt   |   3 +
 lib/Target/BPF/AsmParser/LLVMBuild.txt|  23 ++
 lib/Target/BPF/CMakeLists.txt |   1 +
 lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp |   2 +-
 lib/Target/BPF/LLVMBuild.txt  |   2 +-
 test/MC/BPF/insn-unit.s   | 164 +
 test/MC/BPF/lit.local.cfg |   3 +
 10 files changed, 674 insertions(+), 2 deletions(-)
 create mode 100644 lib/Target/BPF/AsmParser/BPFAsmParser.cpp
 create mode 100644 lib/Target/BPF/AsmParser/CMakeLists.txt
 create mode 100644 lib/Target/BPF/AsmParser/LLVMBuild.txt
 create mode 100644 test/MC/BPF/insn-unit.s
 create mode 100644 test/MC/BPF/lit.local.cfg

diff --git a/include/llvm/MC/MCParser/MCTargetAsmParser.h b/include/llvm/MC/MCParser/MCTargetAsmParser.h
index 6d76d51..e5d5a2a 100644
--- a/include/llvm/MC/MCParser/MCTargetAsmParser.h
+++ b/include/llvm/MC/MCParser/MCTargetAsmParser.h
@@ -266,6 +266,8 @@ public:
   virtual bool equalIsAsmAssignment() { return true; };
   // Return whether this start of statement identifier is a label
   virtual bool isLabel(AsmToken ) { return true; };
+  // Return whether this parser accept star as start of statement
+  virtual bool starIsStartOfStatement() { return false; };
 
   virtual const MCExpr *applyModifierToExpr(const MCExpr *E,
 MCSymbolRefExpr::VariantKind,
diff --git a/lib/MC/MCParser/AsmParser.cpp b/lib/MC/MCParser/AsmParser.cpp
index 03858d3..ee3cc8d 100644
--- a/lib/MC/MCParser/AsmParser.cpp
+++ b/lib/MC/MCParser/AsmParser.cpp
@@ -1687,6 +1687,11 @@ bool AsmParser::parseStatement(ParseStatementInfo ,
 // Treat '}' as a valid identifier in this context.
 Lex();
 IDVal = "}";
+  } else if (Lexer.is(AsmToken::Star) &&
+ getTargetParser().starIsStartOfStatement()) {
+// Accept '*' as a valid start of statement.
+Lex();
+IDVal = "*";
   } else if (parseIdentifier(IDVal)) {
 if (!TheCondState.Ignore) {
   Lex(); // always eat a token
diff --git a/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
new file mode 100644
index 000..a36d8b1
--- /dev/null
+++ b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -0,0 +1,471 @@
+//===-- BPFAsmParser.cpp - Parse BPF assembly to MCInst instructions --===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is

[iovisor-dev] [PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-07 Thread Jiong Wang via iovisor-dev

Hi,

  As discussed at threads:

https://www.spinics.net/lists/xdp-newbies/msg00320.html
https://www.spinics.net/lists/xdp-newbies/msg00323.html

  This patch adds the initial BPF AsmParser support in LLVM.

  This support is following the "verifier instruction format" which is C-like.

  Things need to mention are:

1. LLVM AsmParser doesn't expect the character "*" to be used as the start
   of a statement while BPF "verifier instruction format" is.  So an new
   generic method "starIsStartOfStatement" is introduced.

2. As the MC assembler is sharing code infrastructure with disassembler, the
   current supported instruction format is *strictly* those registered in
   instruction information .td files.  For example, the parser doesn't
   acccept "*(u8 *)(r0) = r7", instead, you need to write
   "*(u8 *)(r0 + 0) = r7". The offset is mandatory, even when it is zero.
   The instruction information .td files may need to register customized
   ParserMethods to allow more flexible syntaxes.

  The testcase bpf-insn-unit.s contains unit tests for supported syntaxes.

  Comments? Does the current accepted syntaxes look OK?

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/llvm/MC/MCParser/MCTargetAsmParser.h |   2 +
 lib/MC/MCParser/AsmParser.cpp|   5 +
 lib/Target/BPF/AsmParser/BPFAsmParser.cpp| 472 +++
 lib/Target/BPF/AsmParser/CMakeLists.txt  |   3 +
 lib/Target/BPF/AsmParser/LLVMBuild.txt   |  23 ++
 lib/Target/BPF/CMakeLists.txt|   1 +
 lib/Target/BPF/LLVMBuild.txt |   2 +-
 test/MC/BPF/bpf-insn-unit.s  | 112 +++
 test/MC/BPF/lit.local.cfg|   3 +
 9 files changed, 622 insertions(+), 1 deletion(-)
 create mode 100644 lib/Target/BPF/AsmParser/BPFAsmParser.cpp
 create mode 100644 lib/Target/BPF/AsmParser/CMakeLists.txt
 create mode 100644 lib/Target/BPF/AsmParser/LLVMBuild.txt
 create mode 100644 test/MC/BPF/bpf-insn-unit.s
 create mode 100644 test/MC/BPF/lit.local.cfg

diff --git a/include/llvm/MC/MCParser/MCTargetAsmParser.h b/include/llvm/MC/MCParser/MCTargetAsmParser.h
index 6d76d51..e5d5a2a 100644
--- a/include/llvm/MC/MCParser/MCTargetAsmParser.h
+++ b/include/llvm/MC/MCParser/MCTargetAsmParser.h
@@ -266,6 +266,8 @@ public:
   virtual bool equalIsAsmAssignment() { return true; };
   // Return whether this start of statement identifier is a label
   virtual bool isLabel(AsmToken ) { return true; };
+  // Return whether this parser accept star as start of statement
+  virtual bool starIsStartOfStatement() { return false; };
 
   virtual const MCExpr *applyModifierToExpr(const MCExpr *E,
 MCSymbolRefExpr::VariantKind,
diff --git a/lib/MC/MCParser/AsmParser.cpp b/lib/MC/MCParser/AsmParser.cpp
index 03858d3..ee3cc8d 100644
--- a/lib/MC/MCParser/AsmParser.cpp
+++ b/lib/MC/MCParser/AsmParser.cpp
@@ -1687,6 +1687,11 @@ bool AsmParser::parseStatement(ParseStatementInfo ,
 // Treat '}' as a valid identifier in this context.
 Lex();
 IDVal = "}";
+  } else if (Lexer.is(AsmToken::Star) &&
+ getTargetParser().starIsStartOfStatement()) {
+// Accept '*' as a valid start of statement.
+Lex();
+IDVal = "*";
   } else if (parseIdentifier(IDVal)) {
 if (!TheCondState.Ignore) {
   Lex(); // always eat a token
diff --git a/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
new file mode 100644
index 000..0db8c26
--- /dev/null
+++ b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -0,0 +1,472 @@
+//===-- BPFAsmParser.cpp - Parse BPF assembly to MCInst instructions --===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCParser/MCAsmLexer.h"
+#include "llvm/MC/MCParser/MCParsedAsmOperand.h"
+#include "llvm/MC/MCParser/MCTargetAsmParser.h"
+#include "llvm/MC/MCRegisterInfo.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/TargetRegistry.h"
+
+using namespace llvm;
+
+namespace {
+struct BPFOperand;
+
+class BPFAsmParser : public MCTargetAsmParser {
+  SMLoc getLoc() const { return getParser().getTok().getLoc(); }
+
+  bool MatchAndEmitInstruction(SMLoc IDLoc, unsign