I'd suggest reading the stream reader/writer subtypes of 
AbstractSerializationStream to understand what all of the values are for. 
In short, a gwt-rpc response is a payload and a string table, and the 
payload's elements reference the string table. You cannot know the 
structure for certain without seeing the original Java types being 
serialized, but often you can make good guesses.
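
To make the payload/string table split concrete, here is a minimal sketch in Java. It assumes a version 7 response of the form //OK[...payload..., [...string table...], flags, version], and that the response is not chunked with ].concat([ as very large responses can be. The class and method names are made up for illustration, and escape sequences inside strings are kept verbatim rather than decoded.

```java
import java.util.ArrayList;
import java.util.List;

final class GwtRpcResponseSplitter {

    /** The pieces of a "//OK[...]" response. */
    static final class Parts {
        final List<String> payload = new ArrayList<>();     // array order; consumed back to front
        final List<String> stringTable = new ArrayList<>();
        int flags;
        int version;
    }

    static Parts split(String response) {
        if (!response.startsWith("//OK")) {
            throw new IllegalArgumentException("expected a //OK response");
        }
        Parts parts = new Parts();
        StringBuilder token = new StringBuilder();
        boolean hasToken = false;
        boolean inString = false;
        int depth = 0; // 1 = outer array, 2 = nested string table

        for (int i = 4; i < response.length(); i++) {
            char c = response.charAt(i);
            if (inString) {
                if (c == '\\' && i + 1 < response.length()) {
                    token.append(c).append(response.charAt(++i)); // keep escapes verbatim
                } else if (c == '"') {
                    inString = false;
                } else {
                    token.append(c);
                }
            } else if (c == '"') {
                inString = true;
                hasToken = true; // so that empty strings are still emitted
            } else if (c == '[') {
                depth++;
            } else if (c == ',' || c == ']') {
                if (hasToken) {
                    (depth == 2 ? parts.stringTable : parts.payload).add(token.toString());
                    token.setLength(0);
                    hasToken = false;
                }
                if (c == ']') {
                    depth--;
                }
            } else if (!Character.isWhitespace(c)) {
                token.append(c);
                hasToken = true;
            }
        }
        // Assumption: the two values after the string table are flags, then version.
        parts.version = Integer.parseInt(parts.payload.remove(parts.payload.size() - 1));
        parts.flags = Integer.parseInt(parts.payload.remove(parts.payload.size() - 1));
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical response; the type names and hashes are placeholders.
        Parts p = split("//OK[1,2,7,1,[\"x.ArrayList/1\",\"x.CourseMember/2\"],0,7]");
        System.out.println("payload     = " + p.payload);
        System.out.println("stringTable = " + p.stringTable);
        System.out.println("flags=" + p.flags + " version=" + p.version);
    }
}
```

Everything is kept as raw string tokens on purpose: without the Java types you cannot tell whether a given token is an int, a string table reference, or part of a long.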

I'd also suggest reading Stack Overflow posts and the like that show how to 
deserialize other payloads just from context. Here's a post that breaks 
down a payload to understand its contents: 
https://stackoverflow.com/questions/35047102/serializing-rpc-gwt/35047887#35047887

If you haven't yet, read 
https://docs.google.com/document/d/1eG0YocsYYbNAtivkLtcaiEE5IOF5u4LUol8-LL0TIKU/edit
as well.

In short though, your response value is _probably_ a List of 
CourseMember types - knowing that class will help you. I can't easily guess 
more than that. As the above doc says, the JSON array is read backwards, so 
the important details are right before and after the string array - 
you have 1,7,2,1[...strings...] in the second image. From that I can say:

1: if this were zero, it would be a null. Since it is a positive number, 
read the (value - 1) entry from the string table, which is ArrayList, so: 
read a value of type ArrayList from the stream.
7: the ArrayList has 7 items.
2: first item in the ArrayList - as above, if this were 0, it would be null. 
Since it is positive, read the (value - 1) entry from the string table and 
decode that type, so: read a CourseMember object from the payload.
1: this is _probably_ the number 1 in the first field of the first 
CourseMember.
...

A parser continuing in this way, with knowledge of the structure of these 
types, could be written to decode this payload. I don't know of an 
off-the-shelf tool that will do it for you in a truly automated way, but 
I could consult to write one, or guide your project in implementing one by 
hand.
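
As a rough illustration of the first few steps walked through above, here is a minimal Java sketch of a reader that consumes the payload from the end and resolves type names through the string table. It assumes the sequence 1,7,2,1 is listed in the order the reader consumes it, so the array itself holds those values in reverse. The class name, the string table entries, and the CourseMember type signature are placeholders, not values from your response. Negative values (which, as far as I know, GWT's own reader treats as back-references to objects already read) are not handled.

```java
import java.util.Arrays;
import java.util.List;

final class BackwardsReaderSketch {
    private final List<String> payload;      // tokens in array order
    private final List<String> stringTable;
    private int index;                       // one past the next token to read

    BackwardsReaderSketch(List<String> payload, List<String> stringTable) {
        this.payload = payload;
        this.stringTable = stringTable;
        this.index = payload.size();         // start at the end, read backwards
    }

    int readInt() {
        return Integer.parseInt(payload.get(--index));
    }

    /** 0 means null; a positive value n refers to string table entry n - 1. */
    String readTypeName() {
        int ref = readInt();
        if (ref < 0) {
            throw new UnsupportedOperationException("negative reference not handled in this sketch");
        }
        return ref == 0 ? null : stringTable.get(ref - 1);
    }

    public static void main(String[] args) {
        // Hypothetical data: the array ends with ...,1,2,7,1 so it is read as 1,7,2,1.
        List<String> payload = Arrays.asList("1", "2", "7", "1");
        List<String> strings = Arrays.asList("x.ArrayList/1", "x.CourseMember/2");
        BackwardsReaderSketch reader = new BackwardsReaderSketch(payload, strings);

        String listType = reader.readTypeName();  // type of the top-level value
        int size = reader.readInt();              // number of items in the list
        String itemType = reader.readTypeName();  // type of the first item
        int firstField = reader.readInt();        // a guess: first field of that item

        System.out.println(listType + " with " + size + " items");
        System.out.println("first item: " + itemType + ", first field = " + firstField);
    }
}
```

From here, a real decoder needs one read routine per type, written against the actual field order of CourseMember, which is why knowing that class matters.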

Since you're scraping anyway, consider just scraping the results of the 
rendered page. This will likely take substantially more CPU time, but 
far less developer time to implement. 

On Tuesday, October 8, 2024 at 12:03:30 PM UTC-5 [email protected] wrote:

> Thank you for the detailed response.
>
> I want to crawl data on a public website, I opened devtools and saw that 
> it was written by GWT RPC.
>
> This is the body of request I saw: 
>
> 7|0|10|
> https://a.b.c.d/e|5C6CDB13D0FD25B266F3C36FA7FF6ED9|a1.a2.a3.DataService|getCourseMembers|java.lang.Long/4227064769|java.lang.String/2004016611|java.util.List|20204524|java.util.Arrays$ArrayList/2507071751|20241|1|2|3|4|3|5|6|7|5|TXbrzIAAA|8|9|1|6|10|
>
> As you can see, there is no problem with that syntax. I can understand it 
> roughly, and I know the method is getCourseMembers. I want to build a 
> function that returns the above body, like: 
>                   public static String getBodyEncoded(String methodName, 
> ... String methodBody ...) or something similar, and return the body above 
> to send to the server.
>
> I also want to know the last part of the request syntax:
>                  1|2|3|4|1|5|6|7|7|8|7|9|7|10|7|11|7|12|7|13|7|14|
>
> Next is the response body, which is the real problem. The response is 
> very long, so I put it in the attached files.
>
> I saw a JsonArray with more than 2000 elements, and I cannot understand 
> what they are. The only thing I understand is the 2042nd element, which 
> contains an unordered list. Maybe some of the elements before it contain 
> data about the order.
>
> I want to build a method to extract/deserialize this response.
>
> I am a newbie. If my question can be answered, can you guide me with more 
> details, please?
> Java is good, but other languages are acceptable; I can still deploy it.
>