Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-08 Thread max haughton via Digitalmars-d-learn

On Monday, 6 June 2022 at 15:13:45 UTC, rempas wrote:
I tried to find anything that shows code, but I wasn't able 
to find anything except for an answer on Stack Overflow. I could 
find a lot of theory but no practical code that works. What I 
want to do is allocate memory (with execution mapping), add the 
machine instructions, then allocate another memory block for 
the data and finally execute the block of memory that contains 
the code. So something like what the OS loader does when 
reading an executable. I have come up with the following code:


If you know the instructions ahead of time, LDC and GDC will both 
let you put a function in its own section, and you can then use 
some linker magic to get pointers to the beginning and end of 
that section.
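
A minimal sketch of that idea, assuming LDC on an ELF target linked with GNU ld or lld. The section name `mycode` and the function `answer` are made up for illustration. The `__start_mycode` / `__stop_mycode` symbols are the ones these linkers synthesize for sections whose name is a valid C identifier; other linkers and platforms may need a linker script instead.

```d
// Assumes LDC on Linux (ELF) with GNU ld or lld.
import ldc.attributes : section;
import core.stdc.stdio : printf;

// Hypothetical section name; it has to be a valid C identifier so the
// linker generates the __start_/__stop_ symbols for it.
@section("mycode")
extern (C) int answer()
{
    return 42;
}

// Provided by the linker, not defined anywhere in the source.
extern (C) extern __gshared ubyte __start_mycode;
extern (C) extern __gshared ubyte __stop_mycode;

// Returns the raw machine code bytes of the whole section.
const(ubyte)[] sectionBytes()
{
    const(ubyte)* begin = &__start_mycode;
    const(ubyte)* end = &__stop_mycode;
    return begin[0 .. end - begin];
}

void main()
{
    auto bytes = sectionBytes();
    printf("answer() = %d, section holds %zu bytes\n", answer(), bytes.length);
}
```

The returned slice could then be copied into a mapped buffer, provided the code in the section is position independent.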

Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-08 Thread rempas via Digitalmars-d-learn

On Monday, 6 June 2022 at 15:13:45 UTC, rempas wrote:

In case someone is wondering, I found an answer in another
forum. The code is the following:

import core.stdc.stdio;
import core.stdc.string;
import core.stdc.stdlib;
import core.sys.posix.sys.mman;

// Parse a string of space-separated hex bytes and append them to the code buffer
void putbytes(char **code, const char *bytes) {
  uint bt;
  for (int i = 0, n; sscanf(bytes + i, "%x%n", &bt, &n) == 1; i += n)
    *(*code)++ = cast(char)bt;
}

// Append the pointer value itself (8 bytes on x86-64) as an immediate operand
void putdata(char **code, char **data) {
  memcpy(*code, data, (*data).sizeof);
  *code += (*data).sizeof;
}

extern (C) void main() {
  char *data = cast(char*)mmap(null, cast(ulong)15, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANON, -1, 0);
  strcpy(data, "Hello world!\n");

  char *code = cast(char*)mmap(null, cast(ulong)500, PROT_READ | PROT_WRITE | PROT_EXEC,
                               MAP_PRIVATE | MAP_ANON, -1, 0);
  char *pos = code;

  // Call the "write" and "exit" system calls
  putbytes(&pos, "48 C7 C0 1 0 0 0");  // mov rax, 0x01 (write syscall)
  putbytes(&pos, "48 C7 C7 1 0 0 0");  // mov rdi, 0x01
  putbytes(&pos, "48 C7 C2 D 0 0 0");  // mov rdx, 13 (string length)
  putbytes(&pos, "48 BE");             // movabs rsi, data (string address)
  putdata(&pos, &data);
  putbytes(&pos, "0F 05");             // syscall
  putbytes(&pos, "48 C7 C0 3C 0 0 0"); // mov rax, 0x3C (exit syscall)
  putbytes(&pos, "0F 05");             // syscall

  // Execute the code
  (cast(void* function())code)();
}


Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread rempas via Digitalmars-d-learn

On Monday, 6 June 2022 at 18:05:23 UTC, Johan wrote:
This instruction is wrong. Note that you are writing twice to 
RDX, but also that you are using `mov sign_extend imm32, reg64` 
instead of `mov imm64, reg64` (`0x48 0xBA`?). Third, why append 
an extra zero (`*cast(char*)(code + 32) = 0x00;`)? That must be 
a bug too.


Thanks! It seems that there is a "typo" in the original source 
that I got the code from. The hex values are different, however, 
so the mistake is only in the comment; the code works in the 
example repository (and I made a D version that works too). The 
padding at the end seems to be necessary, otherwise the example 
doesn't compile (I don't know why, I'm a SUPER n00b when it comes 
to machine language, I know almost nothing!). I'm also not sure 
what the encoding for `mov imm64, reg64` would be, as I tried to 
type what you wrote in the parentheses and it doesn't seem to 
work.

Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread Johan via Digitalmars-d-learn

On Monday, 6 June 2022 at 15:13:45 UTC, rempas wrote:

  // mov rdx, <string length>
  *cast(char*)(code + 14) = 0x48;
  *cast(char*)(code + 15) = 0xC7;
  *cast(char*)(code + 16) = 0xC2;
  *cast(char*)(code + 17) = 12;
  *cast(char*)(code + 18) = 0x00;
  *cast(char*)(code + 19) = 0x00;
  *cast(char*)(code + 20) = 0x00;

  // mov rdx, <address of data>
  *cast(char*)(code + 21) = 0x48;
  *cast(char*)(code + 22) = 0xC7;
  *cast(char*)(code + 23) = 0xC1;
  *cast(long*)(code + 24) = cast(long)data;
  *cast(char*)(code + 32) = 0x00;

This instruction is wrong. Note that you are writing twice to 
RDX, but also that you are using `mov sign_extend imm32, reg64` 
instead of `mov imm64, reg64` (`0x48 0xBA`?). Third, why append 
an extra zero (`*cast(char*)(code + 32) = 0x00;`)? That must be a 
bug too.
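
For reference, a small sketch of how the `mov imm64, reg64` form can be encoded. The helper name `encodeMovImm64` is made up for illustration, and only the eight legacy registers (rax through rdi) are handled. The layout is the REX.W prefix `0x48`, then the opcode `0xB8` plus the register number, then the 8-byte immediate in little-endian order, for 10 bytes in total.

```d
import std.stdio : writefln;

/// Encodes `mov reg64, imm64` for register numbers 0..7
/// (rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi).
ubyte[10] encodeMovImm64(ubyte reg, ulong imm)
{
    assert(reg < 8, "r8..r15 need REX.B and are left out of this sketch");
    ubyte[10] buf;
    buf[0] = 0x48;                    // REX.W: 64-bit operand size
    buf[1] = cast(ubyte)(0xB8 + reg); // B8+rd selects the destination register
    foreach (i; 0 .. 8)
        buf[2 + i] = cast(ubyte)(imm >> (8 * i)); // little-endian immediate
    return buf;
}

void main()
{
    enum ubyte RDX = 2;
    auto insn = encodeMovImm64(RDX, 0x1122334455667788);
    writefln("%(%02X %)", insn[]); // ten bytes, starting with 48 BA
}
```

With this layout the instruction occupies offsets 21 to 30 in the code above, so no padding byte is needed after the address.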


Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread rempas via Digitalmars-d-learn

On Monday, 6 June 2022 at 16:24:58 UTC, Guillaume Piolat wrote:


Thank you! And I just noticed that the second source is from 

Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread rempas via Digitalmars-d-learn

On Monday, 6 June 2022 at 16:08:28 UTC, Adam D Ruppe wrote:

On a lot of systems, it can't be executable and writable at the 
same time, it is a security measure.


so you might have to mprotect it to remove the write permission 
before trying to execute it.

idk though

Thank you! This was very helpful and I can see why it is a clever 
idea to not allow it (and I love that OpenBSD was the first to 
introduce it!!) and I love security stuff ;)

However, even with "mprotect", or if I just use "PROT_READ" and 
"PROT_EXEC", it still doesn't work, so there must be something 
else I'm doing wrong...

Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread Guillaume Piolat via Digitalmars-d-learn

On Monday, 6 June 2022 at 15:13:45 UTC, rempas wrote:

Any ideas?


Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread Adam D Ruppe via Digitalmars-d-learn

On Monday, 6 June 2022 at 15:13:45 UTC, rempas wrote:
  void* code = mmap(null, cast(ulong)500, PROT_READ | PROT_WRITE 
| PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

On a lot of systems, it can't be executable and writable at the 
same time, it is a security measure.


so you might have to mprotect it to remove the write permission 
before trying to execute it.

idk though
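
A minimal sketch of that approach, assuming x86-64 Linux and 4 KiB pages: map the buffer read/write, copy the code in, then switch it to read/execute with `mprotect` before calling it. The function name `runMachineCode` and the six test bytes (`mov eax, 42; ret`) are illustrative only.

```d
import core.stdc.string : memcpy;
import core.sys.posix.sys.mman;
import std.exception : errnoEnforce;

/// Copies `code` into a fresh page, flips the page from RW to RX,
/// calls it as an `extern (C) int function()` and returns the result.
int runMachineCode(const(ubyte)[] code)
{
    enum size_t pageSize = 4096; // assumption: 4 KiB pages
    assert(code.length <= pageSize);

    void* mem = mmap(null, pageSize, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANON, -1, 0);
    errnoEnforce(mem != MAP_FAILED, "mmap failed");
    scope (exit) munmap(mem, pageSize);

    memcpy(mem, code.ptr, code.length);

    // Drop the write permission before executing (W^X).
    errnoEnforce(mprotect(mem, pageSize, PROT_READ | PROT_EXEC) == 0,
                 "mprotect failed");

    alias Fn = extern (C) int function();
    return (cast(Fn) mem)();
}

void main()
{
    // mov eax, 42 ; ret
    static immutable ubyte[] answer = [0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3];
    assert(runMachineCode(answer) == 42);
}
```

The generated code ends in `ret` here, so control returns to the caller; code that ends in an `exit` system call instead never comes back.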

Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread rempas via Digitalmars-d-learn

On Monday, 6 June 2022 at 15:27:12 UTC, Alain De Vos wrote:
Note, it is also possible to do inline assembly with asm{...} 
or __asm(T) {..}.

Thank you for the info! I am aware of that, but I don't want to 
do this in practice. I just want to learn how it works. It will 
be useful when I build my own OS.

Re: How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread Alain De Vos via Digitalmars-d-learn
Note, it is also possible to do inline assembly with asm{...} 
or __asm(T) {..}.
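
A short sketch of the `asm { ... }` form, assuming a compiler that supports DMD-style inline assembly on x86-64 (DMD and LDC do). The function `addOne` is made up for illustration, and other targets fall back to plain D.

```d
import std.stdio : writeln;

ulong addOne(ulong x)
{
    ulong result;
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, x;      // load the argument
            inc RAX;         // add one
            mov result, RAX; // store into the local variable
        }
    }
    else
    {
        result = x + 1; // fallback where this asm syntax is unavailable
    }
    return result;
}

void main()
{
    writeln(addOne(41));
}
```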

How to map machine instctions in memory and execute them? (Aka, how to create a loader)

2022-06-06 Thread rempas via Digitalmars-d-learn
I tried to find anything that shows code, but I wasn't able to 
find anything except for an answer on Stack Overflow. I could find 
a lot of theory but no practical code that works. What I want to 
do is allocate memory (with execution mapping), add the machine 
instructions, then allocate another memory block for the data 
and finally execute the block of memory that contains the code. 
So something like what the OS loader does when reading an 
executable. I have come up with the following code:

import core.stdc.stdio;
import core.stdc.string;
import core.stdc.stdlib;
import core.sys.linux.sys.mman;

extern (C) void main() {
  char* data = cast(char*)mmap(null, cast(ulong)15, 
PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

  memset(data, 0x0, 15); // Default value

  *data = 'H';
  data[1] = 'e';
  data[2] = 'l';
  data[3] = 'l';
  data[4] = 'o';
  data[5] = ' ';

  data[6] = 'w';
  data[7] = 'o';
  data[8] = 'r';
  data[9] = 'l';
  data[10] = 'd';
  data[11] = '!';

  void* code = mmap(null, cast(ulong)500, PROT_READ | PROT_WRITE 
| PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

  memset(code, 0xc3, 500); // Default value

  /* Call the "write" and "exit" system calls*/
  // mov rax, 0x04
  *cast(char*)code = 0x48;
  *cast(char*)(code + 1) = 0xC7;
  *cast(char*)(code + 2) = 0xC0;
  *cast(char*)(code + 3) = 0x04;
  *cast(char*)(code + 4) = 0x00;
  *cast(char*)(code + 5) = 0x00;
  *cast(char*)(code + 6) = 0x00;

  // mov rbx, 0x01
  *cast(char*)(code + 7)  = 0x48;
  *cast(char*)(code + 8)  = 0xC7;
  *cast(char*)(code + 9)  = 0xC3;
  *cast(char*)(code + 10) = 0x01;
  *cast(char*)(code + 11) = 0x00;
  *cast(char*)(code + 12) = 0x00;
  *cast(char*)(code + 13) = 0x00;

  // mov rdx, <string length>
  *cast(char*)(code + 14) = 0x48;
  *cast(char*)(code + 15) = 0xC7;
  *cast(char*)(code + 16) = 0xC2;
  *cast(char*)(code + 17) = 12;
  *cast(char*)(code + 18) = 0x00;
  *cast(char*)(code + 19) = 0x00;
  *cast(char*)(code + 20) = 0x00;

  // mov rdx, <address of data>
  *cast(char*)(code + 21) = 0x48;
  *cast(char*)(code + 22) = 0xC7;
  *cast(char*)(code + 23) = 0xC1;
  *cast(long*)(code + 24) = cast(long)data;
  *cast(char*)(code + 32) = 0x00;

  // int 0x80
  *cast(char*)(code + 33) = 0xcd;
  *cast(char*)(code + 34) = 0x80;

  /* Execute the code */
  (cast(void* function())code)();
}

I'm 100% sure that the instructions work, as I have tested them 
with another example that creates an ELF executable file and it 
executed correctly. So unless I copy-pasted them wrong, the 
instructions are not the problem. The only thing that may be 
wrong is how I'm getting the location of the "data" "segment". 
As I see it, this uses 8 bytes for the memory address (I'm on a 
64-bit machine) and it takes the memory address the "data" 
variable holds, so I would expect it to work.

Any ideas?