Hi, I was thinking about the timeline of tasks.
The main tasks are:

1. Add instrumentation to identify the slow parts of processing new revisions.
2. Improve the performance of these slow parts.

Below are some ideas for dividing these tasks into smaller steps; let me know what you think.

For the first task, I understand that it all starts with identifying how the data for new revisions arrives at the Guix Data Service: the relevant queries and how the code processes them. Based on that, I propose to start by mapping these queries and where they are used, so that I can run them locally and collect statistics on them. With this information I could identify which ones are likely to be problematic and work on them. If the process is slow but the query is not, the problem may be in the code rather than in the query.

For the second task, improving the performance of the slow parts, it is still a bit abstract for me to see how to break it into smaller tasks. I believe it will require reformulating some parts of the queries and, since their results may change a bit, some tweaks to the code may be needed too. My point is: how can I propose an approach to the improvements if I don't yet know exactly what needs to be improved? Still, I imagine that this second task is more demanding than the first and will take most of the internship.

I would appreciate it if you could clarify some of these points.

--
Best Regards,
Luciana Lima Brito
MSc. in Computer Science
