I second what Colin says "*Since you're scraping anyway, consider just 
scraping the results of the rendered page? This will likely take 
substantially more CPU time, but ridiculously less developer time to 
implement."*

GWT RPC is not a public API. The wire format is tied to the compiled application, so it can change whenever the website is updated.

I'd recommend either using a proper API (if one exists), or an 
off-the-shelf scraper tool.

On Wednesday 9 October 2024 at 4:40:08 am UTC+11 Colin Alworth wrote:

> I'd suggest reading the stream reader/writer subtypes of 
> AbstractSerializationStream to understand what all of the values are for. 
> In short, a gwt-rpc response is a payload and a string table, and the 
> payload's elements will reference the string table. You cannot know what the 
> structure is for certain without seeing the original Java types being 
> serialized, but often you can make good guesses.
>
> I'd also suggest reading stackoverflow posts and the like showing how to 
> deserialize other payloads just from context - here's a post that breaks 
> down a payload to understand its contents: 
> https://stackoverflow.com/questions/35047102/serializing-rpc-gwt/35047887#35047887
>
> If you haven't yet, read 
> https://docs.google.com/document/d/1eG0YocsYYbNAtivkLtcaiEE5IOF5u4LUol8-LL0TIKU/edit 
> as well.
>
> In short though, your response value is _probably_ a List of 
> CourseMember types, and knowing that class will help you. I can't easily guess 
> more though. As the above doc says, the json array is read backwards, so 
> the important details would be right before and after the string array: 
> you have 1,7,2,1[...strings...] in the second image. From that I can say:
>
> 1: if this were zero, it would be null. Since it is a positive number, 
> read the (value - 1) entry from the string table, which is ArrayList, so: 
> read a value of type ArrayList from the stream.
> 7: the ArrayList has 7 items.
> 2: first item in the ArrayList. As above, if this were 0, it would be 
> null. Since it is positive, read the (value - 1) entry from the string 
> table and decode that type, so: read a CourseMember object from the payload.
> 1: this is _probably_ the number 1 in the first field of the first 
> CourseMember.
> ...
>
> A parser continuing in this way, with knowledge of the structure of these 
> types, could be written to decode this payload. I don't know of an 
> off-the-shelf tool that will do it for you in a truly automated way, but 
> I could consult to write one, or guide your project in implementing one by 
> hand.
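
To make the steps quoted above concrete, here is a minimal Java sketch of a reader that walks the payload backwards and resolves references against the string table. It is not GWT's own reader. The class name, the placeholder string table entries, and the assumption that the tokens sit on the wire as ...,1,2,7,1 (so that reading from the end yields 1, 7, 2, 1) are all mine, and the response is assumed to be already split into payload tokens and string table.

```java
import java.util.List;

/** Hypothetical sketch of reading a gwt-rpc response payload backwards. */
public class BackwardsReaderSketch {
    private final List<String> payload;     // tokens before the string table, in wire order
    private final List<String> stringTable; // the string table from the response
    private int index;                      // next token to read, moving toward 0

    public BackwardsReaderSketch(List<String> payload, List<String> stringTable) {
        this.payload = payload;
        this.stringTable = stringTable;
        this.index = payload.size() - 1;    // start at the last token
    }

    /** Reads the next token (moving backwards) as a plain int. */
    public int readInt() {
        return Integer.parseInt(payload.get(index--));
    }

    /** Reads a string table reference: 0 means null, otherwise entry (value - 1). */
    public String readString() {
        int ref = readInt();
        return ref == 0 ? null : stringTable.get(ref - 1);
    }

    public static void main(String[] args) {
        // Placeholder table: the real entries also carry type signatures.
        List<String> table = List.of("java.util.ArrayList", "CourseMember");
        // Assumed wire order, so that backwards reading gives 1, 7, 2, 1.
        BackwardsReaderSketch reader =
                new BackwardsReaderSketch(List.of("1", "2", "7", "1"), table);

        System.out.println("type = " + reader.readString());        // container type
        System.out.println("size = " + reader.readInt());           // number of items
        System.out.println("item type = " + reader.readString());   // type of first item
        System.out.println("first field = " + reader.readInt());    // probably its first field
    }
}
```

Going further than this needs the field order and field types of CourseMember, which is exactly the part that has to be guessed without the original Java classes.
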
>
> Since you're scraping anyway, consider just scraping the results of the 
> rendered page? This will likely take substantially more CPU time, but 
> ridiculously less developer time to implement. 
>
> On Tuesday, October 8, 2024 at 12:03:30 PM UTC-5 [email protected] wrote:
>
>> Thank you for the detailed response.
>>
>> I want to crawl data from a public website. I opened devtools and saw that 
>> it uses GWT RPC.
>>
>> This is the request body I saw: 
>>
>> 7|0|10|
>> https://a.b.c.d/e|5C6CDB13D0FD25B266F3C36FA7FF6ED9|a1.a2.a3.DataService|getCourseMembers|java.lang.Long/4227064769|java.lang.String/2004016611|java.util.List|20204524|java.util.Arrays$ArrayList/2507071751|20241|1|2|3|4|3|5|6|7|5|TXbrzIAAA|8|9|1|6|10|
>>
>> As you can see, there is no problem with that syntax. I can roughly 
>> understand it, and I know the method is getCourseMembers. I want to build 
>> a function that returns the body above, like: 
>>                   public static String getBodyEncoded(String methodName, 
>> ... String methodBody ...) or something similar, which returns the body above 
>> to send to the server.
>>
>> I also want to understand the last part of the request syntax:
>>                  1|2|3|4|1|5|6|7|7|8|7|9|7|10|7|11|7|12|7|13|7|14|
>>
>> Next is the response body. This is the real problem. The response is 
>> very long, so I put it in the attached files.
>>
>> I saw a JSON array with more than 2000 elements, and I cannot understand 
>> what they are. The only thing I understand is the 2042nd element, which 
>> contains an unordered list. Maybe some of the elements before it contain 
>> data about the order.
>>
>> I want to build a method to extract/deserialize this response.
>>
>> I am a newbie. If my question can be answered, could you guide me in 
>> more detail, please?
>> Java is preferred, but other languages are acceptable; I can still deploy them.
>>
