Re: [iovisor-dev] minutes: IO Visor TSC/Dev Meeting

2018-05-17 Thread Jiong Wang via iovisor-dev

On 17/05/2018 05:38, Brenden Blanco via iovisor-dev wrote:

Hi All,

Thanks for attending the meeting today. As usual, here are my notes.

=== Discussion ===
Bjorn
- AF_XDP zero copy implementation RFC on netdev
- some uapi concerns, may need to back out changes
- but, over next couple weeks before merge window, discuss and improve in
  hopes of merging
- clean up non-intel code, respin, leave out i40e patches at the end
- separate i40e cleanup spin
- bcm may try to join as 2nd vendor to flesh out uapi
- some discussion of common code organization
- xdp in separate file helps backporting

Yonghong
- some python3 fixes
- proposals for improved container support
- rewriter complexity increasing
- tracing event introspection

Martin
- BTF id landed
- working on bpftool support
- still using pahole to convert dwarf
- still unclear how integration with llvm will look

Jiong
- dominator tree info RFCs pushed
- significant mem/exec time costs to run (e.g. 2x more, 3x slower)


Just for the record, if I listened correctly, John proposed to introduce
a switch for the new CFG infrastructure, i.e, we keep the exisiting DFS
based check_cfg, the new CFG code will only be enabled when check_cfg
detected a loop and wants know whether it is bounded, by this we could
avoid extra mem/exec time cost on programs that don't have loop.

Regards,
Jiong


John
- induction variable tracking in progress
- next: min/max value tracking
both: to work on topic branch under bpf-next

William
- tried workaround from Yonghong
- still need to dig into root cause

Joe
- implementation of sock lookup going through iterations
- concerns over locking in bpf being a leaky implementation detail


=== Attendees ===
Brenden Blanco
Alexei Starovoitov
Bjorn Topel
Daniel Borkmann
Jesper Brouer
Jiong Wang
Magnus Karlsson
Quentin Monnet
Joe Stringer
Andy Gospodarek
Ferris
Brendan Gregg
Jakub Kicinski
William Tu
Yonghong Song
Alexander Duyck
Sandipan
John F
___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH, bpf-llvm] Fix bug on silently truncating 64-bit immediate

2017-10-12 Thread Jiong Wang via iovisor-dev

On 12/10/2017 12:04, Jesper Dangaard Brouer wrote:

On Thu, 12 Oct 2017 06:21:30 -0400
Jiong Wang  wrote:


We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding.  This caused comparison with wrong value.

I think you send this to the wrong mailing list... this looks like a
patch against the LLVM source code.

Shouldn't you send to: llvm-...@lists.llvm.org ?


Hi Jesper,

  I am thinking the users & developers of eBPF llvm are quite focused in
xdp-newbies and iovisor-dev, so if the code is related with user visible
effect, for example bug fix, and if the code change is straightforward
and small, I tend to just send it to these two so related user could be
aware of it in case they are encountering the same problem, and then this
patch could be posted to llvm review site (https://reviews.llvm.org).

  

This bug looks to be introduced by r308080.

This looks like a very recent change:
   https://llvm.org/viewvc/llvm-project?view=revision=308080
   Sat Jul 15 05:41:42 2017 UTC (2 months, 4 weeks ago)

(As you are sending this to a user mailing list: xdp-newb...@vger.kernel.org)

I want to know if this have made it into a LLVM release? and which release?


This has not been made into any LLVM official release.
The bug may show up if the user is trying to use distributor snapshop release.



As the git-repo[1] replica of LLVM SVN-repo does not git-tag the
releases, I could not answer this question myself via the command:

$ git describe --contains 7c423e0690

[1] http://llvm.org/git/llvm.git


The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.

Reported-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
  lib/Target/BPF/BPFISelLowering.cpp |  8 ++--
  lib/Target/BPF/BPFInstrInfo.td |  2 +-
  test/CodeGen/BPF/select_ri.ll  | 35 +++
  3 files changed, 42 insertions(+), 3 deletions(-)


--

Regards,
Jiong

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


[iovisor-dev] [PATCH, bpf-llvm] Fix bug on silently truncating 64-bit immediate

2017-10-12 Thread Jiong Wang via iovisor-dev
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding.  This caused comparison with wrong value.

This bug looks to be introduced by r308080.  The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.

Reported-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 lib/Target/BPF/BPFISelLowering.cpp |  8 ++--
 lib/Target/BPF/BPFInstrInfo.td |  2 +-
 test/CodeGen/BPF/select_ri.ll  | 35 +++
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/lib/Target/BPF/BPFISelLowering.cpp 
b/lib/Target/BPF/BPFISelLowering.cpp
index d4e06dd..995f206 100644
--- a/lib/Target/BPF/BPFISelLowering.cpp
+++ b/lib/Target/BPF/BPFISelLowering.cpp
@@ -611,11 +611,15 @@ 
BPFTargetLowering::EmitInstrWithCustomInserter(MachineInstr ,
 .addReg(LHS)
 .addReg(MI.getOperand(2).getReg())
 .addMBB(Copy1MBB);
-  else
+  else {
+int64_t imm32 = MI.getOperand(2).getImm();
+// sanity check before we build J*_ri instruction.
+assert (isInt<32>(imm32));
 BuildMI(BB, DL, TII.get(NewCC))
 .addReg(LHS)
-.addImm(MI.getOperand(2).getImm())
+.addImm(imm32)
 .addMBB(Copy1MBB);
+  }
 
   // Copy0MBB:
   //  %FalseValue = ...
diff --git a/lib/Target/BPF/BPFInstrInfo.td b/lib/Target/BPF/BPFInstrInfo.td
index fcd6a60..a3ad2ee 100644
--- a/lib/Target/BPF/BPFInstrInfo.td
+++ b/lib/Target/BPF/BPFInstrInfo.td
@@ -460,7 +460,7 @@ let usesCustomInserter = 1 in {
   (ins GPR:$lhs, i64imm:$rhs, i64imm:$imm, GPR:$src, 
GPR:$src2),
   "# Select PSEUDO $dst = $lhs $imm $rhs ? $src : $src2",
   [(set i64:$dst,
-   (BPFselectcc i64:$lhs, (i64 imm:$rhs), (i64 imm:$imm), 
i64:$src, i64:$src2))]>;
+   (BPFselectcc i64:$lhs, (i64immSExt32:$rhs), (i64 
imm:$imm), i64:$src, i64:$src2))]>;
 }
 
 // load 64-bit global addr into register
diff --git a/test/CodeGen/BPF/select_ri.ll b/test/CodeGen/BPF/select_ri.ll
index b802b64..7b1f852 100644
--- a/test/CodeGen/BPF/select_ri.ll
+++ b/test/CodeGen/BPF/select_ri.ll
@@ -25,3 +25,38 @@ entry:
 }
 
 attributes #0 = { norecurse nounwind readonly }
+
+; test immediate out of 32-bit range
+; Source file:
+
+; unsigned long long
+; load_word(void *buf, unsigned long long off)
+; asm("llvm.bpf.load.word");
+;
+; int
+; foo(void *buf)
+; {
+;  unsigned long long sum = 0;
+;
+;  sum += load_word(buf, 100);
+;  sum += load_word(buf, 104);
+;
+;  if (sum != 0x1ULL)
+;return ~0U;
+;
+;  return 0;
+;}
+
+; Function Attrs: nounwind readonly
+define i32 @foo(i8*) local_unnamed_addr #0 {
+  %2 = tail call i64 @llvm.bpf.load.word(i8* %0, i64 100)
+  %3 = tail call i64 @llvm.bpf.load.word(i8* %0, i64 104)
+  %4 = add i64 %3, %2
+  %5 = icmp ne i64 %4, 8589934591
+; CHECK:  r{{[0-9]+}} = 8589934591 ll
+  %6 = sext i1 %5 to i32
+  ret i32 %6
+}
+
+; Function Attrs: nounwind readonly
+declare i64 @llvm.bpf.load.word(i8*, i64) #1
-- 
2.7.4

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support

2017-09-19 Thread Jiong Wang via iovisor-dev

On 18/09/2017 22:29, Daniel Borkmann wrote:

On 09/18/2017 10:47 PM, Jiong Wang wrote:

Hi,

   Currently, LLVM eBPF backend always generate code in 64-bit mode, 
this may

cause troubles when JITing to 32-bit targets.

   For example, it is quite common for XDP eBPF program to access 
some packet
fields through base + offset that the default eBPF will generate 
BPF_ALU64 for
the address formation, later when JITing to 32-bit hardware, 
BPF_ALU64 needs
to be expanded into 32 bit ALU sequences even though the address 
space is

32-bit that the high bits is not significant.

   While a complete 32-bit mode implemention may need an new ABI 
(something like
-target-abi=ilp32), this patch set first add some initial code so we 
could

construct 32-bit eBPF tests through hand-written assembly.

   A new 32-bit register set is introduced, its name is with "w" 
prefix and LLVM
assembler will encode statements like "w1 += w2" into the following 
8-bit code

field:

 BPF_ADD | BPF_X | BPF_ALU

BPF_ALU will be used instead of BPF_ALU64.

   NOTE, currently you can only use "w" register with ALU statements, 
not with
others like branches etc as they don't have different encoding for 
32-bit

target.


Great to see work in this direction! Can we also enable to use / emit
all the 32bit BPF_ALU instructions whenever possible for the currently
available bpf targets while at it (which only use BPF_ALU64 right now)?


Hi Daniel,

   Thanks for the feedback.

   I think we could also enable the use of all the 32bit BPF_ALU under 
currently
available bpf targets.  As we now have 32bit register set support, we 
could make
i32 type as legal type to prevent it be promoted into i64, then hook it 
up with i32

ALU patterns, will look into this.

Regards,
Jiong
___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH RFC 3/4] New 32-bit register set

2017-09-19 Thread Jiong Wang via iovisor-dev

On 19/09/2017 07:44, Y Song wrote:

Hi, Jiong,

Thanks for the patch! It is a great start to support 32bit register in BPF.
In the past, I have studied a little bit to see whether 32bit register
support may reduce
the number of unnecessary shifts on x86_64 and improve the
performance. Looking through
a few bpf programs and it looks like the opportunity is not great, but
still nice to have if we
have this capability. As you mentioned, this definitely helped 32bit
architecture!

I am not an expert in LLVM tablegen. I briefly looked through the code
change and it looks like
correct. Hopefully some llvm-dev tablegen experts can comment on the
implementation.

Below I only have a couple of minor comments.


Yong Hong,

  Thanks for the review.

  I have addressed your comments and attached the updated patch.

  Do you want me to put this patch set on to llvm review website? I 
guess it is the

  formal review channel?

Regards,
Jiong

Acked-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 

diff --git a/lib/Target/BPF/BPFRegisterInfo.td 
b/lib/Target/BPF/BPFRegisterInfo.td
index c8e24f8..da1d6b5 100644
--- a/lib/Target/BPF/BPFRegisterInfo.td
+++ b/lib/Target/BPF/BPFRegisterInfo.td
@@ -11,31 +11,42 @@
 //  Declarations that describe the BPF register file
 
//===--===//
 
+let Namespace = "BPF" in {
+  def sub_32 : SubRegIndex<32>;
+}
+
+class Wi Enc, string n> : Register {
+  let HWEncoding = Enc;
+  let Namespace = "BPF";
+}
+
 // Registers are identified with 4-bit ID numbers.
 // Ri - 64-bit integer registers
-class Ri Enc, string n> : Register {
-  let Namespace = "BPF";
+class Ri Enc, string n, list subregs>
+  : RegisterWithSubRegs {
   let HWEncoding = Enc;
+  let Namespace = "BPF";
+  let SubRegIndices = [sub_32];
 }
 
-// Integer registers
-def R0 : Ri< 0, "r0">, DwarfRegNum<[0]>;
-def R1 : Ri< 1, "r1">, DwarfRegNum<[1]>;
-def R2 : Ri< 2, "r2">, DwarfRegNum<[2]>;
-def R3 : Ri< 3, "r3">, DwarfRegNum<[3]>;
-def R4 : Ri< 4, "r4">, DwarfRegNum<[4]>;
-def R5 : Ri< 5, "r5">, DwarfRegNum<[5]>;
-def R6 : Ri< 6, "r6">, DwarfRegNum<[6]>;
-def R7 : Ri< 7, "r7">, DwarfRegNum<[7]>;
-def R8 : Ri< 8, "r8">, DwarfRegNum<[8]>;
-def R9 : Ri< 9, "r9">, DwarfRegNum<[9]>;
-def R10 : Ri<10, "r10">, DwarfRegNum<[10]>;
-def R11 : Ri<11, "r11">, DwarfRegNum<[11]>;
+foreach I = 0-11 in {
+  // 32-bit Integer (alias to low part of 64-bit register).
+  def W#I  : Wi,  DwarfRegNum<[I]>;
+  // 64-bit Integer registers
+  def R#I  : Ri,  DwarfRegNum<[I]>;
+}
 
 // Register classes.
-def GPR : RegisterClass<"BPF", [i64], 64, (add R1, R2, R3, R4, R5,
-   R6, R7, R8, R9, // callee saved
-   R0,  // return value
-   R11, // stack ptr
-   R10  // frame ptr
-  )>;
+def GPR32 : RegisterClass<"BPF", [i32], 32, (add
+  (sequence "W%u", 1, 9),
+  W0, // Return value
+  W11, // Stack Ptr
+  W10  // Frame Ptr
+)>;
+
+def GPR : RegisterClass<"BPF", [i64], 64, (add
+  (sequence "R%u", 1, 9),
+  R0, // Return value
+  R11, // Stack Ptr
+  R10  // Frame Ptr
+)>;
___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [PATCH RFC 0/4] Initial 32-bit eBPF encoding support

2017-09-19 Thread Jiong Wang via iovisor-dev

On 19/09/2017 05:49, Fulvio Risso wrote:

Dear Jiong,

that's a great work.
I havent' gone through the whole patches, but it seems to me that the 
documentation is not that much.
From my past experiences, putting your hands into a compiler without 
at least some high-level documentation that presents how it works, it 
would be a nightmare.


Hi Fulvio,

  Thanks for the feedback.

  I agree that we need some high-level documentation on this. It could let
people evaluating the implementation approach taken and we could make sure
things on on correct direction.

  I was dividing the 32-bit BPF support into the following three parts and
I am thinking the support of the first part could let the user be able to
construct testcases contaning 32-bit ALU which is good for experimenting
and it won't affect code-gen from .c/.ll. Aslo this part is not hard for
review and relatively independent, so I have seperated them out into this
patch set for review.

  Detailed implementation details or high-level design will be included in
the cover letter of patch set for the second part which is the actually
part for 32-bit code-gen enablement.

brief intro on 32-bit BPF support phases
===
* Support in assembly level.

  This includes cleanup of existing BPF instruction patterns to make them
  32-bit support friendly and also the registration of 32-bit register set.

* Full code-gen Support in LLVM.

  This includes making 32-bit integer type as legal type and associate it
  with 32-bit register class, also we need to cleanup operation legalization
  stragegies.

  We need to discuss whether we want to make 32-bit BPF an new target or
  just an new environment in LLVM's concept, so the user could simply enable
  it through something like --triple=bpf-unknown-elf-ilp32. We may also need
  new ELF attributes and new relocation types for map address relocation for
  32-bit eBPF.

* Hook Clang driver with LLVM.

  Support for this is straightforward, especially if we go with new 
environment
  approach.

Regards,

Jiong

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [v2][PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-12 Thread Jiong Wang via iovisor-dev

On Tue, Sep 12, 2017 at 11:10 AM, Y Song  wrote:

Thanks, Jiong,

The patch looks good. I have applied to llvm trunk.
https://reviews.llvm.org/rL313055


Thanks, Yonghong.




Hmm, I actually feel there is no need to offer an separate syntax for
64bit imm
encoding, it might be better to offer a unified single syntax "r =
signed_imm"
to the user and the encoder (BPFMCCodeEmitter::encodeInstruction)
guarantee to
choose optimal encoding.

Encoder could choose BPF_ALU64 | BPF_MOV | BPF_K for both "r1 = -1" and
"r1 = -1ll" while only resorting to BPF_LD | BPF_DW | BPF_IMM when the imm
really
doesn't fit into 32bit (!isInt<32>(imm_opnd)), for example, "r1 =
0x1ll".


Right now, this optimization actually
has been done in compiler side. So depending on the value, the compiler will
use
"move r1, <32_bit_value>" or "ld_imm64 r1, ".
In this case, it make sense to have different insn formats for two different
kinds of
insns.

What you described is to move the optimization down to MC Emit Code stage,
based on
the value, it can choose to use "move" or "ld_imm64" insn. So optimization
will be available
for .c => .o and for .s => .o. (The current scheme, optimization not
available from .s => .o).

Yes, it is possible. But typically, there is one-to-one mapping between asm
insn to insn encoding.
If later on, we found the complain about this, we can revisit this issue.


OK.

Regards,
Jiong

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [v2][PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-12 Thread Jiong Wang via iovisor-dev



- Adjusted constant load tests.
- Interleaved the expected results with input insns.

  * Changed the disassembly output "ll" to "LL" in BPFInstPrinter.cpp.
This is because I noticed MC parser only accept "LL" as long long
suffix.


Sorry for the confusion caused by "ll". I tested your patch and found that
both r = 5 and r = 5LL generated ld_imm64 insn. This is not what compiler
will do. Further debugging I found that this is due to my change to remove
"ll" from asmstring in the insn definition (.td file). Since the assembler is
patten matching based on asmstring. "r = 5" matched to ld_imm64 as well.

I just fixed the issue and restored " ll" suffix for ld_imm64 asmstring.
You can then restore your previous change to add "ll" as a valid identifier
in the middle of asm statement.


Hmm, I actually feel there is no need to offer an separate syntax for 64bit imm
encoding, it might be better to offer a unified single syntax "r = signed_imm"
to the user and the encoder (BPFMCCodeEmitter::encodeInstruction) guarantee to
choose optimal encoding.

Encoder could choose BPF_ALU64 | BPF_MOV | BPF_K for both "r1 = -1" and
"r1 = -1ll" while only resorting to BPF_LD | BPF_DW | BPF_IMM when the imm 
really
doesn't fit into 32bit (!isInt<32>(imm_opnd)), for example, "r1 = 
0x1ll".

Anyway, patch updated, changes are:

  - added "ll" back.
  - updated the test cases for BPF_MOV | BPF_K in ALU64 class and
BPF_IMM | BPF_DW in LD class.
  - Dropped the change from "ll" to "LL". If we decided to move to "LL" which is
in consistent with LLVM generic parser then could clean this up as a
separate patch.

Please review, thanks.

Regards,
Jiong

Reviewed-by: Yonghong Song 
Signed-off-by: Jiong Wang 

---
 include/llvm/MC/MCParser/MCTargetAsmParser.h |   2 +
 lib/MC/MCParser/AsmParser.cpp|   5 +
 lib/Target/BPF/AsmParser/BPFAsmParser.cpp| 472 +++
 lib/Target/BPF/AsmParser/CMakeLists.txt  |   3 +
 lib/Target/BPF/AsmParser/LLVMBuild.txt   |  23 ++
 lib/Target/BPF/CMakeLists.txt|   1 +
 lib/Target/BPF/LLVMBuild.txt |   2 +-
 test/MC/BPF/insn-unit.s  | 168 ++
 test/MC/BPF/lit.local.cfg|   3 +
 9 files changed, 678 insertions(+), 1 deletion(-)
 create mode 100644 lib/Target/BPF/AsmParser/BPFAsmParser.cpp
 create mode 100644 lib/Target/BPF/AsmParser/CMakeLists.txt
 create mode 100644 lib/Target/BPF/AsmParser/LLVMBuild.txt
 create mode 100644 test/MC/BPF/insn-unit.s
 create mode 100644 test/MC/BPF/lit.local.cfg

diff --git a/include/llvm/MC/MCParser/MCTargetAsmParser.h b/include/llvm/MC/MCParser/MCTargetAsmParser.h
index 6d76d51..e5d5a2a 100644
--- a/include/llvm/MC/MCParser/MCTargetAsmParser.h
+++ b/include/llvm/MC/MCParser/MCTargetAsmParser.h
@@ -266,6 +266,8 @@ public:
   virtual bool equalIsAsmAssignment() { return true; };
   // Return whether this start of statement identifier is a label
   virtual bool isLabel(AsmToken ) { return true; };
+  // Return whether this parser accept star as start of statement
+  virtual bool starIsStartOfStatement() { return false; };
 
   virtual const MCExpr *applyModifierToExpr(const MCExpr *E,
 MCSymbolRefExpr::VariantKind,
diff --git a/lib/MC/MCParser/AsmParser.cpp b/lib/MC/MCParser/AsmParser.cpp
index 03858d3..ee3cc8d 100644
--- a/lib/MC/MCParser/AsmParser.cpp
+++ b/lib/MC/MCParser/AsmParser.cpp
@@ -1687,6 +1687,11 @@ bool AsmParser::parseStatement(ParseStatementInfo ,
 // Treat '}' as a valid identifier in this context.
 Lex();
 IDVal = "}";
+  } else if (Lexer.is(AsmToken::Star) &&
+ getTargetParser().starIsStartOfStatement()) {
+// Accept '*' as a valid start of statement.
+Lex();
+IDVal = "*";
   } else if (parseIdentifier(IDVal)) {
 if (!TheCondState.Ignore) {
   Lex(); // always eat a token
diff --git a/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
new file mode 100644
index 000..d00200c
--- /dev/null
+++ b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -0,0 +1,472 @@
+//===-- BPFAsmParser.cpp - Parse BPF assembly to MCInst instructions --===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCParser/MCAsmLexer.h"
+#include "llvm/MC/MCParser/MCParsedAsmOperand.h"
+#include "llvm/MC/MCParser/MCTargetAsmParser.h"
+#include "llvm/MC/MCRegisterInfo.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include 

[iovisor-dev] [v2][PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-11 Thread Jiong Wang via iovisor-dev

On 09/09/17 00:47, Y Song wrote:

On Fri, Sep 8, 2017 at 1:00 PM, Jiong Wang  wrote:


Hi Y Song,

  Thanks for the review and test.

[snip]


I guess, we may need to add something like LD_VAR_ADDR so that
the proper assembly code can be issued. Also we could drop "ll" with
"r1 = tx_port"?



Personally, I prefer drop "ll". The "ll" suffix was there to tell people it is 
64bit
constant while the register "r1" is with 64-bit width so I feel it is enough. 
For
32-bit situation, we need new register set, some thing like "w1 = tx_port".


I just push a patch (similar to your suggestion below but without
assertion) which still has
"ll" suffix for the constant, but no "ll" suffix for symbols. The
reason is that  we use "ll"
in the asm printout to differentiate "r = 5"=>"mov r, 5" (32bit imm) and
"r = 5ll" => "ld_imm64 r, 5ll" (64bit imm).

Also, for long long constant, C standard does not allow space between
value and "ll" and
hence "567 ll" is not allowed to represent a number "567ll". Could you
try to disallow
64bit immediate like "567 ll" as well in your patch?


Hi Y Song,

Thanks, patch updated, please review.

Changes since the initial version:

  * Addressed all comments on instruction unit tests.
- Renamed test file to "insn-unit.s".
- Removed unnecessary comments.
- Added BPF_B tests for load and store.
- Added BPF_K test for jmp and alu.
  (Noticed a seperate issue with unsigned compare then jump. Looks like BPF
   backend is alway printing the immediate as signed, I guess we need to
   change the operand types in instruction patterns.)
- Adjusted constant load tests.
- Interleaved the expected results with input insns.

  * Changed the disassembly output "ll" to "LL" in BPFInstPrinter.cpp.
This is because I noticed MC parser only accept "LL" as long long suffix.
It might be the reason documented here:

  
https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=19759250

Reviewed-by: Yonghong Song 
Signed-off-by: Jiong Wang 

---
 include/llvm/MC/MCParser/MCTargetAsmParser.h  |   2 +
 lib/MC/MCParser/AsmParser.cpp |   5 +
 lib/Target/BPF/AsmParser/BPFAsmParser.cpp | 471 ++
 lib/Target/BPF/AsmParser/CMakeLists.txt   |   3 +
 lib/Target/BPF/AsmParser/LLVMBuild.txt|  23 ++
 lib/Target/BPF/CMakeLists.txt |   1 +
 lib/Target/BPF/InstPrinter/BPFInstPrinter.cpp |   2 +-
 lib/Target/BPF/LLVMBuild.txt  |   2 +-
 test/MC/BPF/insn-unit.s   | 164 +
 test/MC/BPF/lit.local.cfg |   3 +
 10 files changed, 674 insertions(+), 2 deletions(-)
 create mode 100644 lib/Target/BPF/AsmParser/BPFAsmParser.cpp
 create mode 100644 lib/Target/BPF/AsmParser/CMakeLists.txt
 create mode 100644 lib/Target/BPF/AsmParser/LLVMBuild.txt
 create mode 100644 test/MC/BPF/insn-unit.s
 create mode 100644 test/MC/BPF/lit.local.cfg

diff --git a/include/llvm/MC/MCParser/MCTargetAsmParser.h b/include/llvm/MC/MCParser/MCTargetAsmParser.h
index 6d76d51..e5d5a2a 100644
--- a/include/llvm/MC/MCParser/MCTargetAsmParser.h
+++ b/include/llvm/MC/MCParser/MCTargetAsmParser.h
@@ -266,6 +266,8 @@ public:
   virtual bool equalIsAsmAssignment() { return true; };
   // Return whether this start of statement identifier is a label
   virtual bool isLabel(AsmToken ) { return true; };
+  // Return whether this parser accept star as start of statement
+  virtual bool starIsStartOfStatement() { return false; };
 
   virtual const MCExpr *applyModifierToExpr(const MCExpr *E,
 MCSymbolRefExpr::VariantKind,
diff --git a/lib/MC/MCParser/AsmParser.cpp b/lib/MC/MCParser/AsmParser.cpp
index 03858d3..ee3cc8d 100644
--- a/lib/MC/MCParser/AsmParser.cpp
+++ b/lib/MC/MCParser/AsmParser.cpp
@@ -1687,6 +1687,11 @@ bool AsmParser::parseStatement(ParseStatementInfo ,
 // Treat '}' as a valid identifier in this context.
 Lex();
 IDVal = "}";
+  } else if (Lexer.is(AsmToken::Star) &&
+ getTargetParser().starIsStartOfStatement()) {
+// Accept '*' as a valid start of statement.
+Lex();
+IDVal = "*";
   } else if (parseIdentifier(IDVal)) {
 if (!TheCondState.Ignore) {
   Lex(); // always eat a token
diff --git a/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
new file mode 100644
index 000..a36d8b1
--- /dev/null
+++ b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -0,0 +1,471 @@
+//===-- BPFAsmParser.cpp - Parse BPF assembly to MCInst instructions --===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/ADT/STLExtras.h"

[iovisor-dev] [PATCH RFC] Add BPF AsmParser support in LLVM

2017-09-07 Thread Jiong Wang via iovisor-dev

Hi,

  As discussed at threads:

https://www.spinics.net/lists/xdp-newbies/msg00320.html
https://www.spinics.net/lists/xdp-newbies/msg00323.html

  This patch adds the initial BPF AsmParser support in LLVM.

  This support is following the "verifier instruction format" which is C-like.

  Things need to mention are:

1. LLVM AsmParser doesn't expect the character "*" to be used as the start
   of a statement while BPF "verifier instruction format" is.  So an new
   generic method "starIsStartOfStatement" is introduced.

2. As the MC assembler is sharing code infrastructure with disassembler, the
   current supported instruction format is *strictly* those registered in
   instruction information .td files.  For example, the parser doesn't
   acccept "*(u8 *)(r0) = r7", instead, you need to write
   "*(u8 *)(r0 + 0) = r7". The offset is mandatory, even when it is zero.
   The instruction information .td files may need to register customized
   ParserMethods to allow more flexible syntaxes.

  The testcase bpf-insn-unit.s contains unit tests for supported syntaxes.

  Comments? Does the current accepted syntaxes look OK?

Signed-off-by: Jiong Wang 
---
 include/llvm/MC/MCParser/MCTargetAsmParser.h |   2 +
 lib/MC/MCParser/AsmParser.cpp|   5 +
 lib/Target/BPF/AsmParser/BPFAsmParser.cpp| 472 +++
 lib/Target/BPF/AsmParser/CMakeLists.txt  |   3 +
 lib/Target/BPF/AsmParser/LLVMBuild.txt   |  23 ++
 lib/Target/BPF/CMakeLists.txt|   1 +
 lib/Target/BPF/LLVMBuild.txt |   2 +-
 test/MC/BPF/bpf-insn-unit.s  | 112 +++
 test/MC/BPF/lit.local.cfg|   3 +
 9 files changed, 622 insertions(+), 1 deletion(-)
 create mode 100644 lib/Target/BPF/AsmParser/BPFAsmParser.cpp
 create mode 100644 lib/Target/BPF/AsmParser/CMakeLists.txt
 create mode 100644 lib/Target/BPF/AsmParser/LLVMBuild.txt
 create mode 100644 test/MC/BPF/bpf-insn-unit.s
 create mode 100644 test/MC/BPF/lit.local.cfg

diff --git a/include/llvm/MC/MCParser/MCTargetAsmParser.h b/include/llvm/MC/MCParser/MCTargetAsmParser.h
index 6d76d51..e5d5a2a 100644
--- a/include/llvm/MC/MCParser/MCTargetAsmParser.h
+++ b/include/llvm/MC/MCParser/MCTargetAsmParser.h
@@ -266,6 +266,8 @@ public:
   virtual bool equalIsAsmAssignment() { return true; };
   // Return whether this start of statement identifier is a label
   virtual bool isLabel(AsmToken ) { return true; };
+  // Return whether this parser accept star as start of statement
+  virtual bool starIsStartOfStatement() { return false; };
 
   virtual const MCExpr *applyModifierToExpr(const MCExpr *E,
 MCSymbolRefExpr::VariantKind,
diff --git a/lib/MC/MCParser/AsmParser.cpp b/lib/MC/MCParser/AsmParser.cpp
index 03858d3..ee3cc8d 100644
--- a/lib/MC/MCParser/AsmParser.cpp
+++ b/lib/MC/MCParser/AsmParser.cpp
@@ -1687,6 +1687,11 @@ bool AsmParser::parseStatement(ParseStatementInfo ,
 // Treat '}' as a valid identifier in this context.
 Lex();
 IDVal = "}";
+  } else if (Lexer.is(AsmToken::Star) &&
+ getTargetParser().starIsStartOfStatement()) {
+// Accept '*' as a valid start of statement.
+Lex();
+IDVal = "*";
   } else if (parseIdentifier(IDVal)) {
 if (!TheCondState.Ignore) {
   Lex(); // always eat a token
diff --git a/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
new file mode 100644
index 000..0db8c26
--- /dev/null
+++ b/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -0,0 +1,472 @@
+//===-- BPFAsmParser.cpp - Parse BPF assembly to MCInst instructions --===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "MCTargetDesc/BPFMCTargetDesc.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCParser/MCAsmLexer.h"
+#include "llvm/MC/MCParser/MCParsedAsmOperand.h"
+#include "llvm/MC/MCParser/MCTargetAsmParser.h"
+#include "llvm/MC/MCRegisterInfo.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/TargetRegistry.h"
+
+using namespace llvm;
+
+namespace {
+struct BPFOperand;
+
+class BPFAsmParser : public MCTargetAsmParser {
+  SMLoc getLoc() const { return getParser().getTok().getLoc(); }
+
+  bool MatchAndEmitInstruction(SMLoc IDLoc, unsigned ,
+   OperandVector , MCStreamer ,
+   uint64_t ,
+   bool MatchingInlineAsm) override;
+
+  bool ParseRegister(unsigned , SMLoc , SMLoc ) override;
+
+