zyratlo opened a new pull request, #3819:
URL: https://github.com/apache/texera/pull/3819

   **NOTE:** This tool is still in development; the design choices and 
features currently present are not finalized.
   
   # PR Description
   This PR reintroduces the migration tool branch to the Texera repository 
after it was removed during our transition to an Apache project. The code 
changes included in this PR are purely front-end GUI changes, as the back-end 
is currently a standalone micro-service separate from the Texera codebase.
   
   ## Purpose
   Currently, users who have existing code outside of Texera and want to 
migrate that code to Texera must create a workflow from scratch. This can take 
a long time, depending on the complexity of the code. This tool aims to reduce 
the migration time by using large language models to convert Jupyter Notebooks 
into Texera workflows.
   
   ## Tool Overview (Demo Videos Below)
   The user uploads a Jupyter Notebook, which is sent to the OpenAI LLM API to 
be migrated into a Texera workflow. Once the workflow is generated, the user 
can modify it alongside the original notebook until they are satisfied with 
the migration results.
   
   ## Design
   <img width="2187" height="1314" alt="image" 
src="https://github.com/user-attachments/assets/f1793ce7-9eb1-433b-9a0a-169274511e8a"
 />
   The uploaded notebook is passed through the front-end to the migration 
micro-service in the back-end. The micro-service will handle all communication 
with OpenAI. OpenAI returns the generated workflow to the micro-service, which 
passes it to the front-end to render. The communication design with OpenAI is 
shown below:
   <img width="2590" height="973" alt="image" 
src="https://github.com/user-attachments/assets/93c6aff9-d07a-4c90-8635-40f7c5df01dd"
 />
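
For readers who want a concrete picture of the flow described in this section, here is a minimal Python sketch of what the micro-service side could look like. It is not the code in this PR or in the micro-service: the function names `extract_code_cells` and `migrate_notebook`, the prompt layout, and the assumption that the LLM replies with a JSON-encoded workflow are all hypothetical. The call to OpenAI is represented by an injected `generate_workflow` callable, so the sketch does not depend on any particular client library.

```python
import json
from typing import Callable, Dict, List


def extract_code_cells(notebook: Dict) -> List[str]:
    """Return the source text of every code cell in a notebook dict.

    Assumes the standard Jupyter layout: a top-level "cells" list whose
    entries carry "cell_type" and "source" keys.
    """
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # "source" may be a single string or a list of line strings
        if isinstance(source, list):
            source = "".join(source)
        sources.append(source)
    return sources


def migrate_notebook(
    notebook_json: str,
    generate_workflow: Callable[[str], str],
) -> Dict:
    """Hypothetical micro-service step: notebook in, workflow dict out.

    `generate_workflow` stands in for the LLM call; it takes a prompt
    string and is assumed to return a JSON string describing a workflow.
    """
    notebook = json.loads(notebook_json)
    code_cells = extract_code_cells(notebook)
    # Hypothetical prompt layout: one labelled block per code cell
    prompt = "\n\n".join(
        f"# Cell {i}\n{src}" for i, src in enumerate(code_cells, start=1)
    )
    raw_response = generate_workflow(prompt)
    # Parsing here surfaces malformed LLM output before it reaches the front-end
    return json.loads(raw_response)
```

Injecting the LLM call as a parameter keeps the notebook handling testable without network access, and would make it easier to swap OpenAI for another generator later.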
   
   ## Future Work
   - The main concern is the reliability and accuracy of the workflow returned 
by the LLM. The current effort is to research ways to address this concern, 
such as relying more on algorithmic methods instead of black-box LLM results 
and reducing the dependency on OpenAI.
   - Another effort is to integrate the separate micro-service into the Texera 
back-end.
   
   # Demo
   **1.** User starts with a Jupyter Notebook they want to migrate into Texera.
   
   
https://github.com/user-attachments/assets/88549d9c-92b0-42ce-ba25-5cafceb99daa
   
   **2.** User uploads the Jupyter Notebook using the new tool button.
   
   
https://github.com/user-attachments/assets/ca3621f3-a44a-464a-8996-58edafe94137
   
   **3.** User can view the uploaded notebook from within Texera.
   
   
https://github.com/user-attachments/assets/79f07687-1b88-49b1-8cba-74d7dfa199c1
   
   **4.** Depending on the notebook size and complexity, generation can take 
between one and three minutes. After the workflow is generated, the user can 
begin editing.
   
   
https://github.com/user-attachments/assets/291b4164-b750-49cf-b37c-2d4bcbba87fb
   

